Abstract

Quantum state tomography is the task of inferring the state of a quan- 
tum system by appropriate measurements. Since the frequency distribu- 
tions of the outcomes obtained from any finite number of measurements 
will generally deviate from their asymptotic limits, the estimation of the 
state can never be perfectly accurate, thus requiring the specification of 
error bounds. Furthermore, the individual reconstruction of matrix ele- 
ments of the density operator representation of a state may lead to in- 
consistent results (e.g., operators with negative eigenvalues). Here we 
introduce a framework for quantum state tomography that enables the 
computation of accurate and consistent estimates and reliable error bars 
from a finite set of data and show that these have a well-defined and uni- 
versal operational significance. The method does not require any prior 
assumptions about the distribution of the possible states or a specific 
parametrization of the state space. The resulting error bars are tight, 
corresponding to the fundamental limits that quantum theory imposes on 
the precision of measurements. At the same time, the technique is prac- 
tical and particularly well suited for tomography on systems consisting of 
a small number of qubits, which are currently in the focus of interest in 
experimental quantum information science. 

1 Introduction 

The state of a classical system can in principle be determined to arbitrary pre- 
cision by applying a single measurement to it. Any imprecisions are due solely 
to inaccuracies of the measurement technique, but not of fundamental nature. 
This is different in quantum theory. It follows from Heisenberg's uncertainty 
principle [TS] that measurements generally have a random component and that 
individual measurement outcomes only give limited information about the state 
of the system — even if an ideal measurement device is used. 

To illustrate this difference, it is useful to take an information-theoretic per- 
spective. Assume, for instance, that we are presented with a two-level system 
about which we have no prior information except that it has been prepared in 
a pure state, and our task is to determine this state. If the system were
classical, there would be only two possible pure states, and a single bit of
information would therefore suffice for its full description. Furthermore, a
single measurement (observation) of the system would suffice to retrieve this
bit. If the system is quantum, however, the situation becomes more interesting.
A two-level quantum system (a qubit) admits a continuum of pure states that can,
for example, be parameterized by a point on the Bloch sphere. To determine this
point to a given accuracy $\varepsilon$, at least $\log_2(4/\varepsilon^2)$ bits
of information are necessary.^1 Conversely, according to Holevo's bound, any
measurement applied to the system will provide us with at most one bit of
information [13].^2 Hence, even after many measurements on (identically prepared
copies of) the system, the accuracy to which its state can be predicted always
remains finite, necessitating the specification of error bars.

The impact that randomness in measurement data has on the accuracy of
estimates has been studied extensively in statistics and, in particular,
estimation theory [551 1311 HE]. The latter is concerned with the general
problem of estimating the value of a parameter from data that depends
probabilistically on the parameter. The data may be obtained from measurements
on a quantum system with parameter-dependent state, as considered in quantum
estimation theory [TB]. Quantum state tomography can be seen as a special
instance of quantum estimation, where one aims to estimate a set of parameters
large enough to determine the system's state completely
[TTl |33J 1371 HSl UHl HO].

An obvious choice of parameters is the set of matrix elements of the density
operator representation of states (with respect to a given basis). However,
because the estimates for the individual matrix elements only have a finite
accuracy, the overall matrix generally does not correspond to a valid density
operator (for instance, it may have negative eigenvalues, cf. [1]).^3 This
problem may be avoided with alternative techniques, such as maximum likelihood
estimation (MLE) [19l HJ |2Q], which has been widely used in experiments
[221 [35l ISH 131 [121 E]. However, the method often leads to estimates that
lie on the boundary of the set of possible states, which are unlikely in
typical physical situations. Since, in addition, MLE does not provide error
bars with operational significance, the estimates cannot generally be used to
infer reliable claims about the system under investigation, e.g., whether it
exhibits entanglement (cf. the discussion in [4]).

Here we take a different approach to quantum state tomography, which 
avoids the problems outlined above. The approach is fully operational, in the 
sense that the estimates as well as the associated error bars have an interpre- 
tation in terms of (in principle measurable) probabilities. Their operational 
significance is independent of any assumptions on the structure of the measured 
states or on the chosen parameterization of state space. In particular, in
contrast to standard Bayesian techniques [T^l [Ml [51 [Ml H], our method does
not rely on the choice of a prior. This makes it highly robust so that it can,
for instance, be applied in the context of quantum cryptography, where the
states to be estimated are chosen adversarially.

^1 Covering the area of the (two-dimensional) Bloch sphere requires at least
$4/\varepsilon^2$ circle discs of radius $\varepsilon$.

^2 In fact, closer inspection reveals that Holevo's bound is far too optimistic
in this case, and the amount of information provided by each of a series of
measurements usually decreases quickly. For instance, if the state is already
known to an accuracy $\varepsilon$, a subsequent measurement will typically
provide only $\varepsilon$ bits of extra information.

^3 Another possibility is to represent the quantum states as points on the
(generalized) Bloch ball and use a suitable parameterization of the ball. This
may however be problematic for systems of dimension larger than two because not
all points on the (generalized) Bloch ball correspond to valid states
(cf. [26]).

2 Tomographic Procedure 

In the following we formally introduce the task of quantum state tomography 
and present our method for producing reliable estimates. The basic idea is to 
view quantum state tomography as the task of making predictions: based on 
certain measurements applied to a sample of quantum systems, we would like 
to predict the state of other systems that have not yet been measured. More 
specifically, we consider a scenario consisting of a collection
$S_1, \ldots, S_{n+k}$ of finite-dimensional quantum systems, for example spin
particles prepared in a series of experiments. From this collection, a sample
consisting of $n$ systems is selected at random and measured. The resulting
data should then allow us to estimate the state of the $k$ remaining systems
(see orange part of Fig. 1).

Instead of applying the measurement to $n$ randomly chosen systems, one
may equivalently permute the initial collection of $n + k$ systems at random
and then measure the first $n$ of them, i.e., $S_1, \ldots, S_n$. In the
following, we will therefore assume without loss of generality that the joint
state of the collection $S_1, \ldots, S_{n+k}$ (before the measurement) is
permutation invariant. Formally, this state is specified by a density operator
$\rho^{n+k}$ on an $(n+k)$-fold product space $\mathcal{H}^{\otimes (n+k)}$
with the property that

$$U_\pi \, \rho^{n+k} \, U_\pi^\dagger = \rho^{n+k} \tag{1}$$

for any permutation $\pi \in S_{n+k}$, where $U_\pi$ denotes the canonical
action of the permutation $\pi$ on $\mathcal{H}^{\otimes (n+k)}$.
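
The invariance condition (1) can be checked numerically for small systems. The following is a minimal sketch, assuming NumPy is available; the helper names `permutation_operator`, `symmetrize` and `is_permutation_invariant` are hypothetical and only illustrate the random-permutation argument for three qubits.

```python
import itertools
import numpy as np

def permutation_operator(perm, d):
    """Unitary U_pi on (C^d)^{(x) n} that moves tensor factor i to position perm[i]."""
    n = len(perm)
    dim = d ** n
    U = np.zeros((dim, dim))
    for idx in itertools.product(range(d), repeat=n):
        out = [0] * n
        for i, p in enumerate(perm):
            out[p] = idx[i]
        row = np.ravel_multi_index(tuple(out), (d,) * n)
        col = np.ravel_multi_index(idx, (d,) * n)
        U[row, col] = 1.0
    return U

def symmetrize(rho, n, d):
    """Average U_pi rho U_pi^T over all permutations (a random relabelling of the systems)."""
    perms = list(itertools.permutations(range(n)))
    total = np.zeros_like(rho)
    for p in perms:
        U = permutation_operator(p, d)
        total += U @ rho @ U.T          # U is a real permutation matrix, so U^dagger = U^T
    return total / len(perms)

def is_permutation_invariant(rho, n, d, tol=1e-12):
    """Check Eq. (1) for every permutation of the n tensor factors."""
    for p in itertools.permutations(range(n)):
        U = permutation_operator(p, d)
        if not np.allclose(U @ rho @ U.T, rho, atol=tol):
            return False
    return True

# Three qubits in the product state |0><0| (x) |1><1| (x) |1><1|, which is not symmetric.
zero = np.diag([1.0, 0.0])
one = np.diag([0.0, 1.0])
rho = np.kron(np.kron(zero, one), one)
rho_sym = symmetrize(rho, n=3, d=2)
```

After the random permutation, `rho_sym` satisfies (1) even though `rho` does not, which is the sense in which permutation invariance costs no generality.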

In the literature on tomography, it is common to assume that all $n + k$
systems are prepared in an identical "unknown" state $\sigma$. Under this
assumption (sometimes called iid assumption, for independent and identical
distribution) their joint state, $\rho^{n+k}$, is a convex mixture of product
states of the form $\sigma^{\otimes (n+k)}$, i.e.,

$$\rho^{n+k} = \int P(\sigma)\, \sigma^{\otimes (n+k)}\, d\sigma , \tag{2}$$

where $P$ is a probability density function on the set of density operators on
$\mathcal{H}$ (the space associated to any of the individual systems, $S_i$).
While our general results do not rely on such an assumption, we will refer to
it in some of the explanations below in order to uphold the traditional iid
intuition.

The effect of the measurements on the sample $S_1, \ldots, S_n$ is most
generally specified by a Positive Operator Valued Measure (POVM) on
$\mathcal{H}^{\otimes n}$ with elements $B^n$. Each $B^n$ then corresponds to a
possible sequence of outcomes resulting from the measurements on all $n$
systems. In typical realistic scenarios, the $n$ systems are measured
separately, corresponding to a product POVM with $n$ parts. However, the
restriction to product measurements is not required for our treatment.

Figure 1: General scenario. Measurements are applied to a sample
$S_1, \ldots, S_n$ consisting of $n$ systems, chosen at random from a
collection of $n + k$ systems. The outcomes of the measurements are collected
and given as input, $B^n$, to a data analysis procedure, which computes the
estimate density $\mu$, from which one can infer the state on the remaining $k$
(non-measured) systems $S_{n+1}, \ldots, S_{n+k}$. In particular, the estimate
density $\mu$ can be used to predict properties of these systems. To model
this, we consider a hypothetical test procedure $\mathcal{T}$ (blue), which
outputs "success" whenever its input has a desired property. Given the estimate
density $\mu$, we can determine whether the state of
$S_{n+1}, \ldots, S_{n+k}$ passes the test.

Figure 2: Illustration of the estimate density. Panels: (a) $n = 20$;
(b) $n = 240$. The graphs show the estimate density for measurements performed
on $n = 20$ and $n = 240$ qubits that, for illustration purposes, we assume to
be initially in a pure state. Half of the qubits have been measured in the z
direction and half in the y direction with relative frequencies of (0.2, 0.8)
and (0.7, 0.3), respectively. One observes a rapid decrease in the size of the
bright regions which correspond to large values of the estimate density. Since
the measurement (well known from the BB84 quantum cryptographic protocol) is
not tomographically complete, the estimate density converges on two states.
Note that our tomographic procedure is able to accurately deal with such a
situation whereas the maximum likelihood method necessarily will choose one of
the two spots, a situation not compatible with the operational interpretation
put forward in this work.

The method for quantum state tomography we propose involves a specific
data analysis procedure. It takes as input the sequence of outcomes (with
associated POVM element $B^n$) of the measurements on the sample
$S_1, \ldots, S_n$ and invokes an algorithm (described in Section 3) that
calculates the estimate density, $\mu$, for the state of the remaining systems
$S_{n+1}, \ldots, S_{n+k}$. The estimate density $\mu$ is a probability density
function on the set of states of a single system $S$. As we shall see, $\mu$
has a well-defined operational meaning and, in particular, contains all
information that is relevant to determine error bars (see Fig. 2 for an
illustration).

Knowing the state of a quantum system allows us to predict its future
evolution. In particular, if we apply a test with two possible outcomes,
"success" or "failure", the probability of obtaining each of these outcomes is
determined by the system's initial state. The converse is also true: if we know
the system's behavior for arbitrary such tests, we can infer its state. This
equivalence between the system's state and its behavior under tests is key to
the operational interpretation of the estimate density $\mu$ generated by our
data analysis procedure. The basic idea is to imagine hypothetical tests
applied to the systems $S_{n+1}, \ldots, S_{n+k}$. Our main result then implies
that the outcomes of these tests can be predicted based solely on the estimate
density $\mu$ obtained by tomography on $S_1, \ldots, S_n$ (cf. Fig. 1).

In the following, we explain this in more detail, focusing on the practically
relevant case of an experiment that can in principle be repeated arbitrarily
often. This corresponds to the limit where $k$ approaches infinity while $n$,
the number of actual runs of the experiment from which data is collected, is
still finite and may be small (the general case of arbitrary $k$ is treated in
Section 5). In this limit, the joint state of $n + k'$ systems
$S_1, \ldots, S_{n+k'}$ (for $k' \ll k$) is well approximated by a convex
mixture of iid states, i.e., a state of the form (2) (this follows from the
Quantum de Finetti theorem, which uses that $\rho^{n+k}$ is permutation
invariant; cf. [HI [3 [HI [31] and Section 6). Hence, we can equivalently
imagine a preparation procedure that first chooses a density operator $\sigma$
at random according to a probability density function, $P$, and then generates
$n + k'$ identical quantum systems in state $\sigma$. As before, the
measurement is carried out on $n$ of these systems, $S_1, \ldots, S_n$.
Estimating the state of the remaining systems, $S_{n+1}, \ldots, S_{n+k'}$, is
then equivalent to guessing the choice $\sigma$ made by the preparation
procedure. In particular, the hypothetical tests on
$S_{n+1}, \ldots, S_{n+k'}$ can be replaced by ones that check whether the
density operator $\sigma$ is contained in a given set (see Fig. 3).

We will now argue that the output of our quantum state tomography procedure,
the estimate density $\mu$, allows us to estimate $\sigma$ in a well-defined
sense. For this we fix an $\varepsilon > 0$ and consider a subset $\Gamma_\mu$
of the state space of the individual systems such that $\Gamma_\mu$ has high
probability with respect to $\mu$, more precisely a probability exceeding
$1 - \varepsilon'$ for some $\varepsilon'$ depending on $\varepsilon$ as in Q.
It then follows from our main result that, except with failure probability at
most $\varepsilon$, the state $\sigma$ (chosen by the preparation procedure) is
indeed contained in the set $\Gamma_\mu$ or $\delta$-close to it (see (11) for
details).

The estimate density $\mu$ generated by the data analysis procedure therefore
has a specific operational meaning: it characterizes the sets $\Gamma_\mu$
which are likely to contain the state $\sigma$ chosen by the preparation
procedure.
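
As an illustration of how such a set might be obtained in practice, the sketch below selects a high-probability region from a discretized estimate density. The text does not prescribe a particular choice of $\Gamma_\mu$; picking the most probable grid cells first is an assumption made here, and the function name `high_probability_set` and the toy weights are hypothetical.

```python
import numpy as np

def high_probability_set(weights, eps_prime):
    """Indices of the most probable grid cells whose total weight is at least 1 - eps_prime."""
    order = np.argsort(weights)[::-1]                  # most probable cells first
    cumulative = np.cumsum(weights[order])
    count = int(np.searchsorted(cumulative, 1.0 - eps_prime)) + 1
    return order[:min(count, weights.size)]

# Toy discretized estimate density over six cells of state space (weights sum to one).
weights = np.array([0.02, 0.40, 0.30, 0.20, 0.05, 0.03])
gamma = high_probability_set(weights, eps_prime=0.15)
covered = float(weights[gamma].sum())
```

Smaller values of `eps_prime` force the selected region to grow, which is the trade-off between confidence and accuracy discussed in the text.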

We emphasize that the above claim is valid independently of the initial
probability, $P$, by which $\sigma$ is chosen. In other words, the operational
interpretation of the sets $\Gamma_\mu$ does not depend on any extra
assumptions about the preparation procedure or on the specification of a prior.
This makes our procedure reliable and robust. In fact, $\sigma$ could even be
chosen "maliciously", for example in a quantum cryptographic context, where an
adversary may try to pretend that a system has certain properties (e.g., that
its state is entangled while in reality it is not).

The accuracy of the estimate inferred from $\mu$ can be measured in terms of
the size of the sets $\Gamma_\mu$ that have high probability: the smaller the
sets $\Gamma_\mu$, the more accurately the state $\sigma$ is specified. Fig. 4
shows the radius of the sets depending on the number of experiments, $n$, for
the case of complete tomography on a single qubit.

Figure 3: Tomography of identically prepared systems. This scenario can
be seen as a special case of the one described in Fig. 1, corresponding to the
limit where the number of extra systems, $k$, approaches infinity. In this
case, we can without loss of generality assume that the $n$ systems to be
measured are prepared in a two-step process: In the first step, a description
"$\sigma$" is sampled at random (according to some probability density function
$P$ on the set of states of a single system). Next, the description "$\sigma$"
is forwarded to a quantum state generation device which produces $n$ identical
systems $S_1, \ldots, S_n$ in state $\sigma$. The $k$ extra systems of Fig. 1
are replaced by a classical variable that carries the description "$\sigma$",
and the test procedure simply checks whether this state is contained in a
specific set, $\Gamma_\mu$. Given the output of the data analysis procedure,
$\mu$, it is possible to predict whether this (hypothetical) test is
successful.

We note that the estimate density $\mu$ output by our data analysis procedure
can be efficiently specified using a representation in terms of Fourier
components (or spherical harmonics), as described in Section 3. Data obtained
from measurements on $n$ systems specify exactly the first $n$ moments of this
representation. In particular, these moments contain all information that is
needed for a later update of the estimate density based on additional
measurement data (see Fig. 5).



3 Definition and Representation of the Estimate 
Density 

The following discussion refers to the scenario depicted in Fig. 1. The
measurements on the $n$ systems $S_1, \ldots, S_n$ are most generally described
by a POVM $\{B^n\}$ on $\mathcal{H}^{\otimes n}$ (i.e., a family of positive
semi-definite operators $B^n$ on $\mathcal{H}^{\otimes n}$ such that
$\sum_{B^n} B^n = \mathbb{1}_{\mathcal{H}^{\otimes n}}$). Any possible
measurement outcome (i.e., the joint outcome of the measurements on all $n$
systems) corresponds to an element $B^n$ of this POVM. The estimate density
$\mu$ can then be computed as a function of this outcome (see Fig. 2 for an
illustration).

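Formally, $\mu_{B^n}$ is defined as a probability density on the state space
associated to $\mathcal{H}$,

$$\mu_{B^n}(\sigma) = \frac{1}{c_{B^n}} \mathrm{tr}[\sigma^{\otimes n} B^n] , \tag{3}$$

where

$$c_{B^n} = \int \mathrm{tr}[\sigma^{\otimes n} B^n] \, d\sigma
= \frac{\mathrm{tr}\big[(B^n \otimes \mathbb{1}_{\mathcal{K}}^{\otimes n}) \cdot \mathbb{1}_{\mathrm{Sym}^n(\mathcal{H} \otimes \mathcal{K})}\big]}{\dim \mathrm{Sym}^n(\mathcal{H} \otimes \mathcal{K})}$$

and $\mathcal{K} \cong \mathcal{H} \cong \mathbb{C}^d$, and
$\mathbb{1}_{\mathrm{Sym}^n(\mathcal{H} \otimes \mathcal{K})}$ denotes the
identity on the symmetric subspace
$\mathrm{Sym}^n(\mathcal{H} \otimes \mathcal{K})$ of
$(\mathcal{H} \otimes \mathcal{K})^{\otimes n}$.^4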
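
To make definition (3) concrete, the following sketch evaluates the estimate density on a grid for a single qubit with projective measurements along fixed axes. It rests on assumptions not stated in the text: the density is discretized on the Bloch ball, the standard fact that the Hilbert-Schmidt measure of a qubit is uniform on that ball is used, and the function name `estimate_density_on_grid` is hypothetical. Unlike Fig. 2, it does not restrict to pure states.

```python
import numpy as np

def estimate_density_on_grid(counts, grid_points=41):
    """Grid points in the Bloch ball and normalized weights proportional to tr[sigma^{(x) n} B^n].

    counts maps an axis name ('x', 'y' or 'z') to (number of +1 outcomes, number of -1 outcomes).
    """
    axis = np.linspace(-1.0, 1.0, grid_points)
    X, Y, Z = np.meshgrid(axis, axis, axis, indexing="ij")
    inside = X**2 + Y**2 + Z**2 <= 1.0          # valid qubit states: Bloch vectors in the unit ball
    r = {"x": X[inside], "y": Y[inside], "z": Z[inside]}
    likelihood = np.ones(int(inside.sum()))
    for name, (n_plus, n_minus) in counts.items():
        p_plus = (1.0 + r[name]) / 2.0          # Born probability of outcome +1 along this axis
        likelihood *= p_plus**n_plus * (1.0 - p_plus)**n_minus
    weights = likelihood / likelihood.sum()     # discrete analogue of dividing by c_{B^n}
    points = np.stack([r["x"], r["y"], r["z"]], axis=1)
    return points, weights

# Data resembling Fig. 2(a): n = 20, half along z with frequencies (0.2, 0.8), half along y with (0.7, 0.3).
points, weights = estimate_density_on_grid({"z": (2, 8), "y": (7, 3)})
best = points[np.argmax(weights)]
```

Because no x measurement is included, the weights do not depend on the x coordinate, so the density is not concentrated at a single point; this mirrors the tomographically incomplete situation of Fig. 2.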

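It will be convenient to work in a purified picture and to define the
probability density on the pure states on $\mathcal{H} \otimes \mathcal{K}$,
which can be identified with the complex projective space
$\mathbb{CP}^{d^2 - 1}$,

$$\nu_{B^n}(x) := \frac{1}{c_{B^n}} \mathrm{tr}\big[|x\rangle\langle x|^{\otimes n} \cdot (B^n \otimes \mathbb{1}_{\mathcal{K}}^{\otimes n})\big] , \tag{4}$$

that has the property that for all subsets $A$ of the states on $\mathcal{H}$:

$$\int_A \mu_{B^n}(\sigma)\, d\sigma = \int_{x :\, \mathrm{tr}_{\mathcal{K}}(|x\rangle\langle x|) \in A} \nu_{B^n}(x)\, dx . \tag{5}$$

^4 Note that $\dim \mathrm{Sym}^n(\mathcal{H} \otimes \mathcal{K}) = \binom{n + d^2 - 1}{d^2 - 1}$.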
Figure 4: Convergence of the estimate accuracy. Graph (a) shows how
the size of the approximating sets (for a fixed value of $\varepsilon$ = 10~^)
shrinks in terms of the number of measurements, $n$, for the case of Pauli
measurements on a qubit. More precisely, we plotted an estimate $\Delta$ to the
radius of the set $\Gamma^\delta_\mu$ in units of the purified distance, which
equals $\sqrt{1 - F}$, where $F$ denotes the fidelity. Interpreted in our
scenario with $k \to \infty$ (see Fig. 3), this means that we can specify the
state $\sigma$ chosen by the preparation procedure up to an accuracy $\Delta$,
except with probability $\varepsilon$. The relative frequencies of the x, y and
z measurements used for the plot are ca. (0.78, 0.22), (0.56, 0.44) and
(0.91, 0.09), and have been obtained from a simulation of a measurement. Graph
(b) schematically illustrates the estimate density $\mu$, the high probability
set $\Gamma_\mu$, as well as the $\delta$-region around $\Gamma_\mu$. $\Delta$
is an estimate on the radius of $\Gamma^\delta_\mu$.
Figure 5: Updating the estimate density. The estimate density $\mu$ computed
by the tomography procedure contains all relevant information that can be
inferred from the measurement outcomes (with associated POVM element $B^n$).
In particular, if additional measurement data become available (with associated
POVM element $B^{n'}$), an updated estimate density $\mu'$ can be computed
solely from $\mu$ and $B^{n'}$, without reverting to $B^n$ (see Section 4).



The measure $d\sigma$ on the set of mixed qudit states is the measure induced
by the Hilbert-Schmidt inner product. It is explicitly given by
[HI eqs. (3.5-3.7)]

$$d\sigma = f(r)\, dr\, dg ,$$

where $\sigma = g\, \mathrm{diag}(r)\, g^\dagger$, $dr = \prod_{i=1}^{d-1} dr_i$
is the Lebesgue measure on the eigenvalues in the cube $[0, 1]^{d-1}$ and

$$f(r) = \frac{(d^2 - 1)!}{\prod_{j=0}^{d-1} j!\,(j+1)!} \prod_{1 \le i < j \le d} (r_i - r_j)^2$$

for $dg$ the Haar measure on $U(d)$ with $\int dg = 1$. For notational
simplicity we introduced the variable $r_d = 1 - \sum_{i=1}^{d-1} r_i$.
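
The measure $d\sigma$ can also be sampled directly by tracing out an ancilla of a random pure state, which is how it arises in the purified picture. The sketch below assumes NumPy; the function name `sample_hilbert_schmidt_state` is hypothetical, and the comparison value $2d/(d^2+1)$ for the average purity is a standard result that is not taken from the text.

```python
import numpy as np

def sample_hilbert_schmidt_state(d, rng):
    """Draw a d x d density matrix by tracing out a d-dimensional ancilla of a Haar-random pure state."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))   # amplitudes psi[i, j] on H (x) K
    psi = g / np.linalg.norm(g)                                  # normalize (Frobenius norm)
    return psi @ psi.conj().T                                    # partial trace over K

rng = np.random.default_rng(0)
d = 2
samples = [sample_hilbert_schmidt_state(d, rng) for _ in range(20000)]
mean_purity = float(np.mean([np.trace(s @ s).real for s in samples]))
expected_purity = 2 * d / (d * d + 1)                            # 0.8 for a qubit
```

Averages of functions of $\sigma$ with respect to $d\sigma$, such as the normalization $c_{B^n}$, can be estimated in the same way by Monte Carlo.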

As explained in Appendix D, $\nu_{B^n}$ can be decomposed into Fourier
components

$$\nu_{B^n}(x) = \sum_{\ell, m} \nu_{B^n}(\ell, m)\, Y_{\ell, m}(x) ,$$

where $\nu_{B^n}(\ell, m)$ are complex coefficients and $Y_{\ell, m}$ are
orthonormal functions on $\mathbb{CP}^{d^2 - 1}$ labelled by a moment $\ell$ as
well as an additional index $m$. These can be viewed as higher-dimensional
generalizations of spherical harmonics on the Bloch sphere. The sum is
restricted to moments $\ell$ of degree at most $n$. Since for each moment
$\ell$ the number of possible values $m$ is finite, the sum is over finitely
many terms. In particular, $\nu_{B^n}$ is specified by a finite number of
coefficients $\nu_{B^n}(\ell, m)$. It is therefore sufficient for the data
analysis procedure to compute these coefficients, which according to the above
considerations are given explicitly as

$$\nu_{B^n}(\ell, m) = \int \nu_{B^n}(x)\, \overline{Y_{\ell, m}(x)}\, dx . \tag{6}$$
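
The statement that only moments up to degree $n$ appear can be illustrated in a much simpler setting than the one in the text. The sketch below is an analogue, not the $\mathbb{CP}^{d^2-1}$ construction: it assumes a qubit measured only along z, so that the unnormalized density depends on the Bloch coordinate $z$ alone and Legendre polynomials play the role of the functions $Y_{\ell,m}$. The function name `legendre_coefficients` is hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_coefficients(n_plus, n_minus, max_degree):
    """Coefficients c_l of f(z) = ((1+z)/2)^n_plus ((1-z)/2)^n_minus in the Legendre basis P_l."""
    # Gauss-Legendre quadrature with enough nodes to integrate f * P_l exactly.
    nodes, weights = legendre.leggauss(max_degree + n_plus + n_minus + 1)
    f = ((1 + nodes) / 2) ** n_plus * ((1 - nodes) / 2) ** n_minus
    coeffs = []
    for l in range(max_degree + 1):
        basis = legendre.legval(nodes, [0] * l + [1])     # P_l evaluated at the quadrature nodes
        coeffs.append((2 * l + 1) / 2 * np.sum(weights * f * basis))
    return np.array(coeffs)

# n = 4 measurements along z with three +1 outcomes and one -1 outcome.
coeffs = legendre_coefficients(n_plus=3, n_minus=1, max_degree=8)
```

In this toy case all coefficients with degree above $n = 4$ vanish up to rounding, so five numbers describe the density completely.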



4 Updating the Estimate Density 

Consider the scenario depicted by Fig. 5 where the task is to update an earlier
estimate density $\mu_{B^n}$ using additional measurement data $B^{n'}$.

Since the post-measurement state after the first measurement on the $n'$
systems has the form

$$\int \mu_{B^n}(\sigma)\, \sigma^{\otimes n'}\, d\sigma ,$$

one may interpret the estimate density as the P-representation of this state
similar to the well-known P-representation for states of light. Likewise we can
introduce a Q-representation of the measurement operator

$$Q_{B^{n'}}(\sigma) := \mathrm{tr}[\sigma^{\otimes n'} B^{n'}] .$$

The update rule for the estimate density then takes the following form:

$$\mu'(\sigma) = \frac{\mu_{B^n}(\sigma)\, Q_{B^{n'}}(\sigma)}{\int \mu_{B^n}(\sigma')\, Q_{B^{n'}}(\sigma')\, d\sigma'} .$$

Note that this update rule holds more generally as an update rule for
P-representations (see Appendix B, where the existence of such representations
is proven for every density operator).
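
The update step amounts to multiplying the current density by the Q-representation of the new POVM element and renormalizing. The following toy sketch assumes a qubit measured along z only and a one-parameter grid in the Bloch coordinate $z$ with a uniform initial density; these simplifications and the names `q_representation` and `update` are hypothetical and not part of the original procedure.

```python
import numpy as np

def q_representation(z, n_plus, n_minus):
    """Q(sigma) = tr[sigma^{(x) n'} B^{n'}] for z-basis outcomes on a qubit with Bloch component z."""
    p = (1.0 + z) / 2.0
    return p**n_plus * (1.0 - p)**n_minus

def update(mu, q):
    """Multiply the current density by Q and renormalize (grid weights sum to one)."""
    posterior = mu * q
    return posterior / posterior.sum()

z = np.linspace(-1.0, 1.0, 201)                  # toy one-parameter family of states
uniform = np.full(z.size, 1.0 / z.size)

mu_first = update(uniform, q_representation(z, 6, 4))       # first batch, n = 10
mu_updated = update(mu_first, q_representation(z, 1, 4))    # second batch, n' = 5
mu_direct = update(uniform, q_representation(z, 7, 8))      # all n + n' = 15 outcomes at once
```

Updating in two steps gives the same density as processing all outcomes together, so the earlier POVM element $B^n$ is not needed once $\mu_{B^n}$ is known.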

We will now rewrite this update rule as an update rule for the Fourier
components. This is more convenient than updating continuous functions. Let
$\nu_{B^n}(\ell, m)$ be the Fourier components of the density $\nu_{B^n}(x)$ as
above and define $\nu'(\ell', m')$ analogously. Furthermore, let
$q_{B^{n'}}(\ell'', m'')$ be the Fourier components of the function
$\mathrm{tr}[|x\rangle\langle x|^{\otimes n'} \cdot (B^{n'} \otimes \mathbb{1}_{\mathcal{K}}^{\otimes n'})]$.
As a consequence of Lemma 1, the update rule for the Fourier coefficients then
reads

$$\nu'(\ell', m') = \frac{1}{c} \sum_{\ell, m} \sum_{\ell'', m''}
\begin{pmatrix} \ell & \ell' & \ell'' \\ m & m' & m'' \end{pmatrix}
\nu_{B^n}(\ell, m)\, q_{B^{n'}}(\ell'', m'') ,$$

where $c$ is a normalization constant and the coefficients
$\begin{pmatrix} \ell & \ell' & \ell'' \\ m & m' & m'' \end{pmatrix}$ are
related to the Clebsch-Gordan coefficients of $U(d)$. Note that we have
$\ell'_{\max} = n + n'$. That is, the description size of the estimate density
can at most grow by the maximal degree of the Q-representation of the POVM
element. Furthermore, if the initial estimate density is uniform, i.e.,
$\nu(\ell, m) = \delta_{\ell, 0}\, \delta_{m, 0}$ (e.g. if $n = 0$), we retrieve
the special case
$\nu_{B^{n'}}(\ell, m) = \frac{1}{c_{B^{n'}}} q_{B^{n'}}(\ell, m)$ as discussed
earlier (see Q).

5 Operational Interpretation 

In this section we describe the operational significance of the estimate
density $\mu$ in the general scenario depicted in Fig. 1. As above, we will
refer to ancilla systems $\mathcal{K} \cong \mathcal{H}$ associated to each
system $\mathcal{H}$. Operationally, this will allow us to separate randomness
in the actual states we are estimating (e.g., the state $\sigma$ in the main
text) from randomness that is due to the lack of knowledge of the precise state
(i.e., the randomness introduced by the preparation procedure when choosing
$\sigma$ according to the probability density $P$). Let $\rho^{n+k}$ be the
initial state of the systems $S_1, \ldots, S_{n+k}$ before the measurements,
which we can assume without loss of generality to be invariant under
permutations (cf. Eq. (1)). This state has a canonical purification
$\varrho^{n+k}$ supported on
$\mathrm{Sym}^{n+k}(\mathcal{H} \otimes \mathcal{K})$ (see, e.g., Lemma II.5
of 0).

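To formulate our general result, we consider test procedures $\mathcal{T}$
which take as input the state of the $k$ systems $S_{n+1}, \ldots, S_{n+k}$
and (strengthening the analysis further) purifying systems
$\mathcal{R}_{n+1}, \ldots, \mathcal{R}_{n+k}$ and output either "success" or
"fail". Any such procedure can be specified by a POVM with two elements,
$\{T_{\mathrm{fail}},\ \mathbb{1}^{\otimes k}_{\mathcal{H} \otimes \mathcal{K}} - T_{\mathrm{fail}}\}$,
and the probability that it fails for an input state $\varrho^k$ is given by
$\mathrm{tr}(T_{\mathrm{fail}}\, \varrho^k)$. In the following we will relate
this probability to the probability of failure for a tensor product state,
i.e. to $\mathrm{tr}(T_{\mathrm{fail}}\, \psi^{\otimes k})$.

Let again $B^n$ be the POVM element associated to the outcome of the
tomographic measurements on $S_1, \ldots, S_n$, and let $\mu = \mu_{B^n}$ be
the corresponding estimate density computed by the data analysis procedure. By
Born's rule the probability of each outcome is given by
$\mathrm{tr}[B^n \rho^n]$ and the state of the systems
$S_{n+1}, \ldots, S_{n+k}, \mathcal{R}_{n+1}, \ldots, \mathcal{R}_{n+k}$,
conditioned on outcome $B^n$, is

$$\varrho^k_{B^n} = \frac{1}{\mathrm{tr}[B^n \rho^n]}\, \mathrm{tr}_n\big[(B^n \otimes \mathbb{1}_{\mathcal{K}}^{\otimes n} \otimes \mathbb{1}^{\otimes k}_{\mathcal{H} \otimes \mathcal{K}}) \cdot \varrho^{n+k}\big] ,$$

where $\mathrm{tr}_n$ denotes the partial trace on the first $n$ systems.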

Assume now that, depending on the output of the data analysis procedure, we
apply a test $\mathcal{T}_{\mu_{B^n}}$ with POVM elements
$\{T^{\mathrm{fail}}_{\mu_{B^n}},\ \mathbb{1}^{\otimes k}_{\mathcal{H} \otimes \mathcal{K}} - T^{\mathrm{fail}}_{\mu_{B^n}}\}$
to the $k$ remaining systems. We are interested in an upper bound,
$\varepsilon$, on the failure probability of this test,

$$\big\langle \mathrm{tr}(T^{\mathrm{fail}}_{\mu_{B^n}}\, \varrho^k_{B^n}) \big\rangle_{B^n} \le \varepsilon , \tag{7}$$

where $\langle \cdot \rangle_{B^n}$ denotes the expectation taken over all
possible measurement outcomes $B^n$ according to the probability distribution
$\mathrm{tr}[B^n \rho^n]$.

Our main result (Theorem 1) provides a sufficient criterion for (7) to hold,
namely that

$$\int \mu_{B^n}(\mathrm{tr}_{\mathcal{K}} \psi)\, \mathrm{tr}(T^{\mathrm{fail}}_{\mu_{B^n}}\, \psi^{\otimes k})\, d\psi \le \varepsilon' \tag{8}$$

is satisfied for all probability densities $\mu_{B^n}$, where

$$\varepsilon' = \varepsilon \binom{n + k + d^2 - 1}{d^2 - 1}^{-1} , \tag{9}$$

$d$ is the dimension of the Hilbert space $\mathcal{H}$ associated to a single
system $S$, and $d\psi$ is the probability measure on the set of pure states on
$\mathcal{H} \otimes \mathcal{K}$ that is invariant under the action of the
unitary group. In typical situations, $\varepsilon$ decreases exponentially
with $n$. The additional factor $\binom{n + k + d^2 - 1}{d^2 - 1}^{-1}$ in
(12), which is inverse polynomial in $n + k$, therefore plays only a minor role
in the criterion.
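
To get a feeling for the size of the polynomial factor, the sketch below evaluates the threshold $\varepsilon' = \varepsilon / \binom{n+k+d^2-1}{d^2-1}$. It relies on the binomial form of the factor as read from Theorem 1, and the function name `eps_prime` is hypothetical.

```python
from math import comb

def eps_prime(eps, n, k, d):
    """Threshold eps' = eps / C(n + k + d^2 - 1, d^2 - 1) used in the sufficient criterion."""
    return eps / comb(n + k + d * d - 1, d * d - 1)

# Qubit example (d = 2): the binomial coefficient is a cubic polynomial in n + k.
factor = comb(100 + 100 + 3, 3)
value = eps_prime(eps=0.01, n=100, k=100, d=2)
```

For a qubit with $n = k = 100$ the factor is about $1.4 \times 10^6$, so an exponentially small $\varepsilon'$ is only polynomially weakened.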

We note that this statement does not rely on any assumptions about the
initial state $\rho^{n+k}$ (except its invariance under permutations, which is
equivalent to the requirement that the sample to which the measurement is
applied is chosen at random). In particular, the assumption that many copies of
an identical state have been generated, although common in the literature on
quantum tomography, is not necessary.



6 High Probability Sets 

In this section we describe the operational significance of the estimate
density $\mu$ in the practically relevant case depicted in Fig. 3 where $k$,
the number of systems remaining after the measurement, tends to infinity. Since
the initial state of all $n + k$ systems is permutation invariant, the Quantum
de Finetti theorem [3TJ [511 [7| |H1 [31] implies that for any fixed
$k' \in \mathbb{N}$, the marginal state $\rho^{n+k'}$ on $n + k'$ systems is
well approximated by a mixture of product states, i.e.,

$$\rho^{n+k'} = \mathrm{tr}_{k-k'}(\rho^{n+k}) \approx \int P(\sigma)\, \sigma^{\otimes (n+k')}\, d\sigma , \tag{10}$$

for some probability density function $P$. The approximation can be quantified
in terms of the trace distance between the two states, and can be bounded by
an expression that decreases proportionally to $1/k$. In other words, in the
limit of large $k$, the marginal state $\rho^{n+k'}$ is fully specified by the
probability density function $P$. This state may have been generated by a
preparation procedure that samples $\sigma$ according to $P$ and then generates
many copies of this state.

We now consider a test that determines whether the state $\sigma$ chosen by the
preparation procedure is contained in a given set $\Gamma_\mu$ which has high
probability with respect to $\mu$, i.e., whether $\sigma \in \Gamma_\mu$.
It turns out that this test can be approximated very well already for $k' = n$
by a certain test $T^{\mathrm{fail}}$ derived from Holevo's optimal covariant
measurement. More precisely, the test will determine whether $\sigma$ is in a
$\delta$-neighborhood

$$\Gamma^\delta_\mu := \{\sigma : \exists\, \sigma' \in \Gamma_\mu \text{ with } F(\sigma, \sigma') \ge 1 - \delta^2\} \tag{11}$$

of the set $\Gamma_\mu$, where
$F(\sigma, \sigma') := \big(\mathrm{tr}\sqrt{\sqrt{\sigma}\, \sigma' \sqrt{\sigma}}\big)^2$
is the fidelity. Choosing $\delta$ as in Theorem 2, we find that the test
$\sigma \in \Gamma^\delta_\mu$ fails with probability at most $\varepsilon$
(Theorem 2).

Obviously, the assertion that a state $\sigma$ is contained in a certain set
$\Gamma_\mu$ can only be considered a good approximation of $\sigma$ if the set
$\Gamma_\mu$ is small. This is indeed the case, at least for reasonable choices
of the measurement $\{B^n\}$. We refer to Fig. 4 for more details regarding the
convergence of the approximation.

7 Conclusion 

While our technique is different in spirit and in practice from both MLE and
Bayesian approaches (such as the Bayesian mean estimation method), there are
several interesting connections between them (see Appendix F). Firstly, one can
show that the maximum of the estimate density $\mu$ coincides with the estimate
obtained by the MLE method. Secondly, the probability density $\mu$ corresponds
to that obtained from applying Bayes' updating rule to a uniform distribution
over the states (with respect to the probability measure induced by the
Hilbert-Schmidt inner product). In particular, optimality results for Bayesian
inference with uniform priors (see for example [38]) imply optimality of our
method. Conversely, the mathematical techniques developed in this work can be
employed to calculate the updating of permutation invariant states according to
Bayes' rule.

Finally, we note that, for information-theoretic reasons, the size of the
sample needed for full tomography increases exponentially in the system size.
However, prior information about the system, e.g., about its symmetries, can be
used to reduce the number of relevant degrees of freedom. Our method allows
such prior information to be taken into account. In particular, the prior
information may be given in terms of an initial estimate density $\mu$ that is
then updated using the tomographic data (cf. Fig. 5).
A Technical Analysis of Tomographic Procedure

In the following let $\mathcal{H}$ denote a finite-dimensional Hilbert space of
dimension $d$, that is $\mathcal{H} \cong \mathbb{C}^d$. We denote the set of
density matrices on $\mathcal{H}$ by $\mathcal{S}(\mathcal{H})$ and the subset
of pure states by $\mathcal{P}(\mathcal{H})$. Note that
$\mathcal{P}(\mathcal{H})$ can be identified with $\mathbb{CP}^{d-1}$, the
complex projective space of dimension $d - 1$. $\mathbb{CP}^{d-1}$ carries a
natural action of the unitary group $U(d)$. The Haar measure on $U(d)$
therefore descends to a measure on $\mathcal{P}(\mathcal{H})$ which is
invariant under the action of $U(d)$. We denote this measure by $d\phi$ and fix
the normalisation so that $\int d\phi = 1$. The symmetric subspace
$\mathrm{Sym}^n(\mathcal{H})$ of $\mathcal{H}^{\otimes n}$ is defined as the
space of vectors that are invariant under the action of the symmetric group
$S_n$ that permutes the tensor factors (see main text). Since the action of
$S_n$ commutes with the action of the unitary group on
$\mathcal{H}^{\otimes n}$, $U(d)$ acts on $\mathrm{Sym}^n(\mathcal{H})$ as
well. We denote the dimension of $\mathrm{Sym}^n(\mathcal{H})$ by
$\dim(n, d)$.

The following lemma is crucial in the derivation of the main results. An
extension of this lemma is known as the "postselection technique for quantum
channels" [S].

Lemma 1. Let $\rho^n \in \mathcal{S}(\mathrm{Sym}^n(\mathbb{C}^d))$. Then

$$\rho^n \le \dim(n, d) \int \phi^{\otimes n}\, d\phi .$$

Furthermore, $\dim(n, d) = \binom{n + d - 1}{d - 1} \le (n + 1)^{d - 1}$.

Proof. The space $\mathrm{Sym}^n(\mathbb{C}^d)$ is irreducible under the action
of the unitary group $U(d)$ [H]. The operator $\int \phi^{\otimes n}\, d\phi$
is supported on $\mathrm{Sym}^n(\mathbb{C}^d)$ and invariant under the action
of $U(d)$. By Schur's lemma we therefore have

$$\dim(n, d) \int \phi^{\otimes n}\, d\phi = \mathbb{1}_{\mathrm{Sym}^n(\mathbb{C}^d)} .$$

The claim follows since

$$\rho^n \le \mathbb{1}_{\mathrm{Sym}^n(\mathbb{C}^d)}$$

holds for any density operator supported on $\mathrm{Sym}^n(\mathbb{C}^d)$.
$\dim(n, d)$ equals $\binom{n + d - 1}{d - 1}$ and is easily seen to be upper
bounded by $(n + 1)^{d - 1} = \mathrm{poly}(n)$. □
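
The dimension formula and the operator inequality of Lemma 1 can be checked numerically for small $n$ and $d$. The sketch below assumes NumPy; it builds the projector onto the symmetric subspace as the average of all permutation operators, which by the Schur's-lemma identity in the proof equals $\dim(n,d) \int \phi^{\otimes n} d\phi$. The function name `symmetric_projector` is hypothetical.

```python
import itertools
from math import comb, factorial
import numpy as np

def symmetric_projector(n, d):
    """Projector onto Sym^n(C^d), built as the average of all n! permutation operators."""
    dim = d ** n
    P = np.zeros((dim, dim))
    for perm in itertools.permutations(range(n)):
        for idx in itertools.product(range(d), repeat=n):
            permuted = tuple(idx[p] for p in perm)
            row = np.ravel_multi_index(permuted, (d,) * n)
            col = np.ravel_multi_index(idx, (d,) * n)
            P[row, col] += 1.0
    return P / factorial(n)

n, d = 3, 2
P = symmetric_projector(n, d)
dim_sym = int(round(np.trace(P)))

# A product state phi^{(x) n} with phi = |+><+| is supported on the symmetric subspace.
plus = np.array([1.0, 1.0]) / np.sqrt(2)
phi_n = plus
for _ in range(n - 1):
    phi_n = np.kron(phi_n, plus)
rho = np.outer(phi_n, phi_n)
min_gap = float(np.linalg.eigvalsh(P - rho).min())   # non-negative (up to rounding) iff rho <= P
```

For three qubits the symmetric subspace has dimension 4, matching $\binom{n+d-1}{d-1}$ and saturating the bound $(n+1)^{d-1}$.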

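Consider the measure $d\phi$ on a tensor product space
$\mathcal{H} \otimes \mathcal{K} \cong \mathbb{C}^d \otimes \mathbb{C}^d$ and
perform the partial trace operation over system $\mathcal{K}$. We denote the
resulting measure by $d\sigma$ on $\mathcal{S}(\mathcal{H})$ and note that it
may also be defined as the measure induced by the Hilbert-Schmidt metric [44].
For a (permutation-invariant) POVM element $B^n$ on $\mathcal{H}^{\otimes n}$,
our data analysis procedure produces the following estimate density

$$\mu_{B^n}(\sigma) = \frac{1}{c_{B^n}} \mathrm{tr}[B^n \sigma^{\otimes n}] ,$$

where

$$c_{B^n} = \int \mathrm{tr}[B^n \sigma^{\otimes n}]\, d\sigma
= \frac{\mathrm{tr}\big[(B^n \otimes \mathbb{1}_{\mathcal{K}}^{\otimes n}) \cdot \mathbb{1}_{\mathrm{Sym}^n(\mathcal{H} \otimes \mathcal{K})}\big]}{\dim(n, d^2)} .$$

Let $\rho^{n+k}$ be a permutation-invariant density operator on
$\mathcal{H}^{\otimes (n+k)}$ and $\varrho^{n+k}$ a purification with support
on $\mathrm{Sym}^{n+k}(\mathcal{H} \otimes \mathcal{K})$, where
$\mathcal{K} \cong \mathcal{H}$ (see e.g. [8]). That is, $\varrho^{n+k}$ is
pure and $\mathrm{tr}_{\mathcal{K}^{\otimes (n+k)}}\, \varrho^{n+k} = \rho^{n+k}$.
Denote by

$$\varrho^k_{B^n} = \frac{1}{\mathrm{tr}[B^n \rho^n]}\, \mathrm{tr}_n\big[(B^n \otimes \mathbb{1}_{\mathcal{K}}^{\otimes n} \otimes \mathbb{1}^{\otimes k}_{\mathcal{H} \otimes \mathcal{K}}) \cdot \varrho^{n+k}\big]$$

the post-measurement state and note that it appears with probability
$\mathrm{tr}[B^n \rho^n]$.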

Theorem 1. If for all POVM elements $B^n$ and all POVM elements $T^k_{\mathrm{fail}}$ we
have

f,B^ (tr^v^)tr(T,^r r')d^ < £ (" + J " V ' ^^^^ 

then the expected failure probability of the test is bounded by $\varepsilon$, that is

War-e|.])s„<£. (13) 

Proof. 



< dim(n -f /c, d^) J2 I tr[(B" (g) ® Tf^?J" • 
= dim(n + A;,d2)^p5„ /" A^s. (tr;cV')tr[TfX • 



where we used Lemma 1 in the first inequality and the assumption (12) in the
second. □

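For a subset $\Gamma$ of $\mathcal{S}(\mathcal{H})$, we define

$$\Gamma^\delta := \{\sigma : \exists \sigma' \in \Gamma \text{ with } F(\sigma, \sigma') \ge 1 - \delta^2\},$$

where the fidelity is defined as $F(\sigma, \sigma') = (\mathrm{tr}\sqrt{\sqrt{\sigma}\, \sigma' \sqrt{\sigma}})^2$. Note that $\Gamma^\delta$ contains all
states that are not more than $\delta$ away from $\Gamma$ in the purified distance $C(\sigma, \sigma')$, which is
defined as $\sqrt{1 - F(\sigma, \sigma')}$. For more details regarding this distance measure as
well as its relation to the trace distance see [39].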
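As an illustration of these definitions, the following sketch evaluates the fidelity and the purified distance for small density matrices with NumPy, and tests membership in a $\delta$-neighbourhood of a finite list of states. The finite list standing in for the set $\Gamma$ and all function names are assumptions made for this example only.

```python
import numpy as np


def psd_sqrt(a: np.ndarray) -> np.ndarray:
    """Square root of a positive semidefinite Hermitian matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(a)
    vals = np.clip(vals, 0.0, None)  # remove tiny negative rounding errors
    return (vecs * np.sqrt(vals)) @ vecs.conj().T


def fidelity(sigma: np.ndarray, sigma_prime: np.ndarray) -> float:
    """F(sigma, sigma') = (tr sqrt(sqrt(sigma) sigma' sqrt(sigma)))^2."""
    root = psd_sqrt(sigma)
    inner = psd_sqrt(root @ sigma_prime @ root)
    return float(np.real(np.trace(inner)) ** 2)


def purified_distance(sigma: np.ndarray, sigma_prime: np.ndarray) -> float:
    """C(sigma, sigma') = sqrt(1 - F(sigma, sigma'))."""
    return float(np.sqrt(max(0.0, 1.0 - fidelity(sigma, sigma_prime))))


def in_delta_neighbourhood(sigma, gamma, delta: float) -> bool:
    """True if some state in the finite list gamma has fidelity >= 1 - delta^2 with sigma."""
    return any(fidelity(sigma, s) >= 1.0 - delta ** 2 for s in gamma)


if __name__ == "__main__":
    zero = np.diag([1.0, 0.0])
    mixed = np.eye(2) / 2
    print(fidelity(zero, mixed), purified_distance(zero, mixed))
```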

Theorem 2. For all ji, let F^ be such that 



and let 5 y |-(ln | + 21n (^"jj'ij^ ^)). Then for all initial distributions P{cr) 

leading to the state p" — J P{a)(T^^da, the test ct G F^ fails with probability at 
most e. 
Note that it suffices that (14) holds for all $\mu = \mu_{B^n}$.
Proof. The failure probability of the test is given by 

Pfaii(P) := / P(a)Vtr[B".a«"]ej^(a)da 



where 8j=^ — (cr) equals one for a in the set T^^^^ and zero otherwise {A denotes 

Mb" 

the complement of a subset A of 5(H)). This can be rewritten more conveniently 



as 



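Instead of the a priori probability density $P(\sigma)$ we consider a probability density
$Q(\psi)$ on the pure states $\mathcal{P}(\mathcal{H} \otimes \mathcal{K})$, where $\mathcal{K} \cong \mathcal{H}$, that gives rise to $P(\sigma)$. That
is, $Q(\psi)$ satisfies

$$\int_A P(\sigma) d\sigma = \int_{\psi : \mathrm{tr}_{\mathcal{K}} \psi \in A} Q(\psi) d\psi$$

for all measurable subsets $A$ of $\mathcal{S}(\mathcal{H})$. Note that such a probability density $Q(\psi)$
always exists as we can purify the state of the particle with a purifying space
$\mathcal{K} \cong \mathcal{H}$. $\mu_{B^n}(\sigma)$ is replaced by

$$\nu_{B^n}(\psi) := \frac{1}{c_{B^n}} \mathrm{tr}[(B^n \otimes 1_{\mathcal{K}}^{\otimes n}) \psi^{\otimes n}].$$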

Furthermore we consider the following extension of the set F^: 

v{n /c) : ti-ici' e f^} 

and of F^: 

The failure probability of the test can then be expressed as 

Pfail(P) = VCB,. [ Q(^)z/B"(V')dV'- (15) 

Let fc' G N be the number of systems that we use to approximate the test by a 
test procedure that is given by a POVM {Tfaii, 1 — Jfaii}. Defining 



5/2 



we see that for all ■0 G 



trT^fij^''' = 1 - dim{k', / tr[</>^'=>®'=']d'^ 

> 1 - dim(/c', d^) max F(trK;0, trycV')''' 

> l-dim(A:',d^)e-Tfe'. 
Inserting this estimate into (15) leads to

We now remove the restriction in the integral, thereby further weakening the 
estimate and obtain 

Pfaii(P) < e" + E / Q(V^)^S"(V')tr[lfF" • ^'^''W, 



Applying Lemma [l]to the state / Q(V')V'®"+'='dV 

we can find an upper bound 
on this quantity which is independent of the initial distribution P{a) (or Q{tp))- 



Pfaii(P) < e" + dim(n + fc', d^) ^ tr[(S" ® if") ® T^^;';-" • ( / 

= e" + dim(n + k' , (f) ^ cs- j J/s- (V')tr[Tf^p • V^'^VV' 

+ dim(n + fc', E CB. / UB- ii^M^Sr ■ V'^'^V^- 

Since for V' G ^/^ij™ 

tr[Tf^i^^' • V'®'''] < dim(/e',d2) m_a^F(trK(/), trK;^)'''' < dini(fc', ^2)6^4 fc' = g" 
and since 
we find 

^faii(-P) < £" + dim(n + /c',rf2)e" + dim(7i + fc',d2) Vcsn / i^B'-Wdip 

= e" + dim(n + A:', (f )e" + dim(n + fc', (f) ^ cb" / A^S" (cr)dcr. 

B" •^'"('is" 

We now set $k' = n$ and use the assumption

$$\int_{\Gamma_{\mu_{B^n}}} \mu_{B^n}(\sigma) d\sigma \ge 1 - \frac{\varepsilon}{2} \dim(2n, d^2)^{-1}$$

for all $\mu_{B^n}$. This results in

Pfaii(P) < e" +dim(2n,d2)e" + | < dim(2n, d2)2e-T» + |. 

Choosing S = ^ (In | + 2 In dim(2n, cP)) ensures that 

Pf.ii{P) < e. 

□ 

B Quasi-Probability Distributions 

In this section we derive quasi-probability distribution representations for
operators on $\mathrm{Sym}^n(\mathbb{C}^d)$ similar to the P- and Q-representations that are well-known
from quantum optics.

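Theorem 3 (Q-representation). Let $B$ be an operator on $\mathrm{Sym}^n(\mathbb{C}^d)$. Then $B$
is uniquely determined by its Q-representation, the function

$$Q_B(x) = \langle x|^{\otimes n} B |x\rangle^{\otimes n},$$

where $|x\rangle = \sum_i x_i |i\rangle$, $x = (x_1, \ldots, x_d)^T \in \mathbb{C}^d$ and $\sum_i |x_i|^2 = 1$. When
convenient we will view $Q_B$ as a function on $\mathbb{CP}^{d-1}$.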

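Proof. We adopt an argument very similar to the one used in [30, p. 30] in the
context of Glauber coherent states. Note that the values $\langle x|^{\otimes n} B |x\rangle^{\otimes n}$
for $x \in \mathbb{C}^d$ are determined by $\langle x|^{\otimes n} B |x\rangle^{\otimes n}$ for $x \in \mathbb{C}^d$ with $\sum_i |x_i|^2 = 1$.
It therefore suffices to show that the values $\langle x|^{\otimes n} B |x\rangle^{\otimes n}$, $x \in \mathbb{C}^d$, determine
$\langle x|^{\otimes n} B |x'\rangle^{\otimes n}$, $x, x' \in \mathbb{C}^d$, uniquely. Let $\{|m\rangle\}$ be the Gelfand-Zetlin basis for
$\mathrm{Sym}^n(\mathbb{C}^d)$ (see Appendix D) and write

$$B = \sum_{m,m'} B_{m,m'} |m\rangle\langle m'|.$$

The function $\langle m'|x'\rangle^{\otimes n}$ is a polynomial in $x' \in \mathbb{C}^d$ and $\langle x|^{\otimes n}|m\rangle$ is a polynomial
in $\bar{x}$, the complex conjugate of $x$. Defining

$$\alpha = \frac{\bar{x} + x'}{2}, \qquad \beta = i\,\frac{\bar{x} - x'}{2}$$

we see that

$$x' = \alpha + i\beta, \qquad \bar{x} = \alpha - i\beta$$

and hence $\langle x|^{\otimes n} B |x'\rangle^{\otimes n} = \sum_{m,m'} B_{m,m'} \langle x|^{\otimes n}|m\rangle \langle m'|x'\rangle^{\otimes n}$ is a polynomial in
$\alpha, \beta$. Note that every polynomial (in fact every entire function) is determined
by its values for real parameters, i.e. by $\alpha, \beta \in \mathbb{R}^d$ in our case. This can be seen
by writing the polynomial in the form of a Taylor series (around a real point,
e.g. 0). The coefficients in this series are partial derivatives (evaluated at 0),
which can be taken in real directions without losing generality and are therefore
only dependent on real values of the polynomial. Since for real $\alpha, \beta$

$$\mathrm{Im}(\alpha) = \tfrac{1}{2}(-\mathrm{Im}(x) + \mathrm{Im}(x')) = 0$$

$$\mathrm{Im}(\beta) = \tfrac{1}{2}(\mathrm{Re}(x) - \mathrm{Re}(x')) = 0$$

or, in other words, $x = x'$, it follows that $\langle x|^{\otimes n} B |x'\rangle^{\otimes n}$ as a function of $\alpha, \beta$ is
wholly determined by the values $\langle x|^{\otimes n} B |x\rangle^{\otimes n}$, $x \in \mathbb{C}^d$. □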
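The uniqueness statement of Theorem 3 can be probed numerically in the qubit case. The sketch below is an illustration under stated assumptions rather than part of the proof: it uses the standard expansion of $|x\rangle^{\otimes n}$ in an orthonormal (Dicke-type) basis of $\mathrm{Sym}^n(\mathbb{C}^2)$, samples random unit vectors $x$, and checks that the linear map $B \mapsto (Q_B(x))_x$ has full rank $(n+1)^2$, so that only $B = 0$ has a vanishing Q-representation. The names `coherent_vector` and `q_map_rank` are hypothetical.

```python
from math import comb

import numpy as np


def coherent_vector(x: np.ndarray, n: int) -> np.ndarray:
    """Components of |x> tensored n times in an orthonormal basis of Sym^n(C^2)."""
    return np.array([np.sqrt(comb(n, k)) * x[0] ** (n - k) * x[1] ** k
                     for k in range(n + 1)])


def q_map_rank(n: int, samples: int = 400, seed: int = 0) -> int:
    """Rank of the linear map B -> (Q_B(x_1), ..., Q_B(x_samples)) for random unit x."""
    rng = np.random.default_rng(seed)
    rows = []
    for _ in range(samples):
        x = rng.normal(size=2) + 1j * rng.normal(size=2)
        x /= np.linalg.norm(x)
        v = coherent_vector(x, n)
        # Q_B(x) = sum_{ij} conj(v_i) B_ij v_j is linear in the entries of B.
        rows.append(np.kron(v.conj(), v))
    return int(np.linalg.matrix_rank(np.array(rows)))


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, q_map_rank(n), (n + 1) ** 2)
```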

Theorem 4 (P-representation). Let $B$ be an operator on $\mathrm{Sym}^n(\mathbb{C}^d)$. Then $B$
may be represented in the form

$$B = \int P_B(x) |x\rangle\langle x|^{\otimes n} dx$$

where

$$P_B(x) = \sum_\ell \sum_m p_B(\ell, m)\, y_{\ell,m}(x)$$

with the second sum extending over the Gelfand-Zetlin basis for the irreducible
representation of $U(d)$ with highest weight $(\ell, 0, \ldots, 0, -\ell)$ (see Appendix D).
The constants $p_B(\ell, m)$ are uniquely determined by $B$ for $\ell \le n$ and are arbitrary
otherwise.

Two lemmas will be needed in order to prove the theorem. 

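Lemma 2. ([27, p. 35]) Let $D$ be the space of operators on $\mathrm{Sym}^n(\mathbb{C}^d)$ that can
be represented in the form

$$B = \int P_B(x) |x\rangle\langle x|^{\otimes n} dx$$

for some $P_B \in L^2(\mathbb{CP}^{d-1})$. Furthermore let $E$ be the space of operators on
$\mathrm{Sym}^n(\mathbb{C}^d)$ with vanishing Q-representation. Then $D^\perp = E$, where $D^\perp = \{A :
\mathrm{tr} A B^\dagger = 0 \ \forall B \in D\}$.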

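Proof. If $A \in E$, then for all $B \in D$

$$\mathrm{tr} A B^\dagger = \mathrm{tr} A \int \overline{P_B(x)}\, |x\rangle\langle x|^{\otimes n} dx = \int \overline{P_B(x)}\, \langle x|^{\otimes n} A |x\rangle^{\otimes n} dx = 0,$$

hence $A \in D^\perp$. Conversely let $A \in D^\perp$, then

$$\mathrm{tr} A B^\dagger = 0 \quad \forall B \in D.$$

Writing this out results in $\int \overline{P_B(x)}\, \langle x|^{\otimes n} A |x\rangle^{\otimes n} dx = 0$ for all functions $P_B(x)$
on $\mathbb{CP}^{d-1}$. This implies

$$\langle x|^{\otimes n} A |x\rangle^{\otimes n} = 0 \quad \text{for all } x,$$

since only the identically vanishing function is orthogonal to all square integrable
functions on $\mathbb{CP}^{d-1}$ (and, in particular, to itself). □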

Lemma 3. The operators $\int dx\, y_{\ell,m}(x) |x\rangle\langle x|^{\otimes n}$ are nonvanishing and orthogonal
with respect to the Hilbert-Schmidt inner product for $\ell \le n$ and $m$ a corresponding
Gelfand-Zetlin pattern. For $\ell > n$, $\int dx\, y_{\ell,m}(x) |x\rangle\langle x|^{\otimes n} = 0$.

Proof. We calculate 



tr[ / dx ytmix)\x){x 



(gin 



dz ye^rn{z)\z){z\^- 



yi,rn{x)yi'^m'{z)\{x\z)\ dxdz 



yiMi{x)ye,,n'{z) 

1 



dini(n, df 



E 



V V* A" 




m" 



'{x)yi"^m"{z)dxdz 



Sfj'S, 



1 



dim(7i, dy 



V V 





A'' 



where we have used Corollary 2 (Appendix D) in the second equality sign and the
orthonormality of the $y$ functions in the third equality sign. Since $\mathrm{mult}(\ell', n, d)$
is nonzero for $\ell \le n$ and vanishes for $\ell > n$ (see Corollary 2), this concludes the
proof. □

Proof of Theorem 4. By Theorem 3, $\langle x|^{\otimes n} A |x\rangle^{\otimes n} = 0$ for all $x$ implies $A = 0$.
Therefore the operator space $E$ from Lemma 2 contains only the identically
vanishing operator. As a consequence, the space $D$ of operators that have a
P-representation equals the space of operators on $\mathrm{Sym}^n(\mathbb{C}^d)$, which proves the
first part of the claim.

The Fourier decomposition of $P_B$ (see Appendix D)

$$P_B(x) = \sum_{\ell,m} p_B(\ell, m)\, y_{\ell,m}(x)$$

implies the decomposition

$$B = \sum_{\ell,m} p_B(\ell, m) \left( \int dx\, y_{\ell,m}(x) |x\rangle\langle x|^{\otimes n} \right).$$

From Lemma 3 we see that the coefficients $p_B(\ell, m)$ are determined by the
operator $B$ for $\ell \le n$ and are arbitrary for $\ell > n$. □
C Examples



C.l Holevo's Covariant Measurement 

It was shown by Holevo that an optimal measurement procedure (in terms of the 
fidelity) for state estimation is given by the POVM [Ml p. 163].

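We now want to analyse this measurement with our methods. Let us start by
assuming that we have measured the effect $|d\rangle\langle d|^{\otimes n}$, giving rise to an estimate
density

$$\mu_{|d\rangle\langle d|^{\otimes n}}(x) = \dim(n, d)\, |\langle x|d\rangle|^{2n}.$$

We find

$$\lim_{n \to \infty} \mu_{|d\rangle\langle d|^{\otimes n}}(x) = \delta(x), \qquad (16)$$

since $\dim(n, d) |\langle x|d\rangle|^{2n}$ converges to the $\delta$-distribution. This, as expected,
reflects the fact that the scheme is asymptotically correct. If we measured $|z\rangle\langle z|^{\otimes n}$
instead of $|d\rangle\langle d|^{\otimes n}$, we find

$$\mu_{|z\rangle\langle z|^{\otimes n}}(x) = \dim(n, d)\, |\langle x|z\rangle|^{2n}$$

with

$$\lim_{n \to \infty} \mu_{|z\rangle\langle z|^{\otimes n}}(x) = \delta(z - x). \qquad (17)$$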
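The concentration of this estimate density can be made concrete for qubits. The following sketch assumes $d = 2$, writes $|\langle x|d\rangle|^2 = \cos^2(\theta/2)$ with $\theta$ the polar angle measured from the point $d$ on the Bloch sphere, and uses the normalised measure $\frac{1}{4\pi}\sin\theta\, d\theta\, d\phi$. Under these assumptions the mass inside the cap $\theta \le \theta_{\max}$ has the closed form $1 - \cos^{2(n+1)}(\theta_{\max}/2)$, which the numerical integral reproduces. The function names are illustrative.

```python
import numpy as np


def holevo_density(theta: np.ndarray, n: int) -> np.ndarray:
    """Qubit estimate density dim(n,2) |<x|d>|^{2n} with |<x|d>|^2 = cos^2(theta/2)."""
    return (n + 1) * np.cos(theta / 2.0) ** (2 * n)


def cap_mass(theta_max: float, n: int, steps: int = 200_000) -> float:
    """Midpoint-rule integral of the density over the cap theta <= theta_max.

    The phi integral of the measure (1/4pi) sin(theta) dtheta dphi contributes
    a factor 2pi, leaving the weight (1/2) sin(theta) dtheta.
    """
    edges = np.linspace(0.0, theta_max, steps + 1)
    mid = 0.5 * (edges[:-1] + edges[1:])
    width = theta_max / steps
    return float(np.sum(holevo_density(mid, n) * 0.5 * np.sin(mid)) * width)


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, cap_mass(np.pi, n), cap_mass(0.5, n))
```

The total mass stays at one for every $n$, while the mass in a fixed small cap approaches one as $n$ grows, which is the behaviour expressed by (16).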



Let us now do the analysis in terms of Fourier coefficients: Comparing (16)
with $\delta(x) = \sum_\ell y_{\ell,0}(x)$ we see that the Fourier coefficients of $\dim(n, d) |\langle d|x\rangle|^{2n}$
must all converge to one. Explicitly, the latter are given by (see Corollary 2)

$$\dim(n, d)\, |\langle d|x\rangle|^{2n} = \frac{1}{\dim(n, d)} \sum_\ell \begin{bmatrix} \nu & \nu^* & \lambda \\ 0 & 0 & 0 \end{bmatrix} y_{\ell,0}(x), \qquad (18)$$

where $\lambda = (\ell, 0, \ldots, 0, -\ell)$, $\nu = (n, 0, \ldots, 0)$ and $\nu^*$ denotes the highest weight dual
to $\nu$. More generally, we have (see Corollary 3)



dim(n,d)|(z|x)p" ^ - 



1 



dim(n, d) 



E 



U V* \ 





^yi,7n{z)yi^jn{x). (19) 



We conclude this example with a formula for the Fourier coefficients for the
qubit case and derive from it explicit bounds on the convergence to the value one
(Corollary 4):

$$\frac{1}{\dim(n, 2)} \begin{bmatrix} \nu & \nu^* & \lambda \\ 0 & 0 & 0 \end{bmatrix} = \frac{n!\,(n+1)!}{(n-\ell)!\,(n+\ell+1)!}$$

for $\ell \le n$ and zero otherwise. For small $n$ and $\ell$, we have
           ℓ = 0    ℓ = 1     ℓ = 2     ℓ = 3    ℓ = 4
  n = 2      1      5/10      1/10
  n = 3      1      21/35     7/35      1/35
  n = 4      1      84/126    36/126    9/126    1/126

and in general there is the bound (Corollary 4)

$$1 - \frac{\ell(\ell+1)}{n+2} \le \frac{1}{\dim(n, 2)} \begin{bmatrix} \nu & \nu^* & \lambda \\ 0 & 0 & 0 \end{bmatrix} \le 1.$$
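The tabulated values and the bound can be reproduced with exact rational arithmetic. This is a minimal sketch based on the closed formula $n!(n+1)!/((n-\ell)!(n+\ell+1)!)$ quoted from Corollary 4; the function names are chosen for illustration.

```python
from fractions import Fraction
from math import factorial


def qubit_fourier_coefficient(n: int, ell: int) -> Fraction:
    """n!(n+1)! / ((n-ell)!(n+ell+1)!) for ell <= n, and zero otherwise."""
    if ell > n:
        return Fraction(0)
    return Fraction(factorial(n) * factorial(n + 1),
                    factorial(n - ell) * factorial(n + ell + 1))


def lower_bound(n: int, ell: int) -> Fraction:
    """The lower bound 1 - ell(ell+1)/(n+2) from Corollary 4."""
    return 1 - Fraction(ell * (ell + 1), n + 2)


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, [str(qubit_fourier_coefficient(n, ell)) for ell in range(n + 1)])
```

For fixed $\ell$ the lower bound tends to one as $n$ grows, so each coefficient converges to one, as required for convergence to the $\delta$-distribution.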



This example shows how one may perform a convergence analysis of a tomographic
measurement in terms of the Fourier coefficients of the estimate
density. The convergence of the Fourier coefficients to a constant value (for fixed
$\ell$ and $n \to \infty$) is also in agreement with our intuition about the duality of the
Fourier transform: more information about $x$ corresponds to less information
about $(\ell, m)$.



C.2 Basis Measurements 

We will now analyse the case where a product measurement is carried out with 
the measurement in a single system given by an orthonormal basis. 

Assume that the basis in which we measure is the computational basis 
{\i){i\}f- Since the state we are measuring lives in the symmetric subspace, 
we can, without loss of generality, project the effect onto this subspace 
and obtain the projector onto the vector with weight / in the representation 
V^. This vector, denoted by \v,m) is unique and has Gelfand-Zethn pattern 
TO = (to^**-!), . . . ,to(i)) for to(*) = (Ei=d+i-» /^^^0' ...,0). The estimate den- 
sity is therefore given by 

Msy (a;) = dim(r7,, d)\{x\iy, to)^, 

where v = (rt, 0, . . . , 0). By Lemmajsjwe have 

^^?(^'"^')='^™'°dhilkd) 

where A= {£,0, . . . ,0, -£). 

We will now compute these coefficients for qubits, $d = 2$, in the case where
we have measured an equal number, namely $n/2$, of 1s and 2s (i.e. the case $m = n/2$).
We will use this formula to show that the estimate density converges to the uniform
distribution on the equator of the Bloch sphere. It follows from Corollary 4 that
for $\ell$ and $n/2$ even:



ly V* X 

TO —TO 
For large $n$, these coefficients turn into




which are the Fourier coefficients of the uniform distribution on the equator of
the Bloch sphere by Lemma 10. The estimate density therefore concentrates
on the equator just as expected, since we do not obtain any information on the
phase of the state from this measurement.

This example shows that Fourier analysis is able to trace a complicated
convergence behaviour in a compact way. When several bases are used (such as
in the BB84 or the six-state protocols for quantum key distribution) one can use
the formula just derived together with Lemma 6, which allows one to rotate the
basis, in the update rule.

D Spherical Harmonics for Higher Dimensions 

As we have seen, the functions on $\mathcal{S}(\mathcal{H})$, $\mathcal{P}(\mathcal{H})$ and therefore $\mathbb{CP}^{d-1}$ play a
central role in the present work. In this section, we perform a Fourier decomposition
of the functions defined on $\mathbb{CP}^{d-1}$ and derive properties that will enable
us to work with these functions very effectively. Whereas this may be considered
standard by some readers, we include it for the benefit of completeness. Our
construction of an orthonormal basis of functions on $\mathbb{CP}^{d-1}$ uses the representation
theory of $U(d)$ and its subgroup $U(d-1) \times U(1)$. As general references
on representation theory of the unitary group we recommend |131 |B] . 

A (complex) representation of a group $G$ is a finite-dimensional complex
vector space $V$, equipped with an action of $G$ preserving the group operation.
$V$ is irreducible if the only invariant subspaces of $V$ are the zero subspace and
$V$ itself. For $H$ a subgroup of a group $G$, let $V \downarrow^G_H$ denote the restriction of a
representation $V$ of $G$ to $H$.

Let $G = U(d)$, $V$ a (holomorphic) representation of $U(d)$, i.e. a representation
whose representing matrices have entries that are holomorphic functions in
the variables of $U(d)$, and let $H = T(d)$ be the torus of diagonal matrices in
$U(d)$. $V$ decomposes according to

$$V \downarrow^{U(d)}_{T(d)} \cong \bigoplus_w W_w$$

where $W_w$ are the isotypic components of the irreducible representations of $T(d)$.
Since $T(d)$ is abelian its irreducible representations are one-dimensional. Vectors
in $W_w$ are known as weight vectors with weight $w = (w_1, \ldots, w_d)$, $w_i \in \mathbb{Z}$, that
is, for all $|v\rangle \in W_w$:

$$T(d) \ni t : |v\rangle \mapsto t|v\rangle = t_1^{w_1} \cdots t_d^{w_d} |v\rangle,$$

where $t = \mathrm{diag}(t_1, \ldots, t_d)$. The $w$ with $\dim W_w > 0$ are called weights of $V$.
The lexicographical ordering on the set of weights is the relation $w > w'$ if
for the smallest $i$ with $w_i \ne w'_i$, $w_i > w'_i$. It turns out that every irreducible
representation $V$ of $U(d)$ has a unique highest weight $\lambda$ satisfying $\dim W_\lambda = 1$.
$\lambda$ is furthermore dominant, i.e. $\lambda = (\lambda_1, \ldots, \lambda_d)$ satisfies $\lambda_i \ge \lambda_{i+1}$. To every
dominant $\lambda$, there also exists an irreducible representation denoted by $V_\lambda$. Two
irreducible representations $V_\lambda$ and $V_{\lambda'}$ are equivalent if and only if $\lambda = \lambda'$.
In the case where $G = U(d)$ and $H = U(d-1)$ (embedded as
$H \ni h \mapsto \begin{pmatrix} h & 0 \\ 0 & 1 \end{pmatrix} \in G$ and using the definition $|i\rangle = (0, \ldots, 0, 1, 0, \ldots, 0)^T$ for all $i$)
and where the representation of $U(d)$ is irreducible with highest weight $\lambda$ one
has the following decomposition, known as the branching rule for $U(d)$:

$$V_\lambda \downarrow^{U(d)}_{U(d-1)} \cong \bigoplus_\mu V_\mu \qquad (20)$$

where the sum extends over dominant weights $\mu = (\mu_1, \ldots, \mu_{d-1})$ that are
interlaced by $\lambda$, i.e. that satisfy

$$\lambda_{i+1} \le \mu_i \le \lambda_i \quad \forall i \in \{1, \ldots, d-1\}. \qquad (21)$$

Iteratively using the branching rule allows us to define an orthonormal basis
of the representation $V_\lambda$, called Gelfand-Zetlin basis, where any basis vector is
labeled by a sequence of $d$ diagrams $\lambda^{(d)} = \lambda$, $(\lambda^{(d-1)}, \ldots, \lambda^{(1)}) =: m$, such that $\lambda^{(i)}$ is
interlaced by $\lambda^{(i+1)}$. $m$ is called a Gelfand-Zetlin pattern for $\lambda$. The state with
Gelfand-Zetlin pattern $m = ((0^{d-1}), (0^{d-2}), \ldots, (0))$, where $(0^i) = (0, \ldots, 0)$ with $i$ entries,
will be abbreviated by $m = 0$. The corresponding state is $|\lambda, 0\rangle$.
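The iterative use of the branching rule can be illustrated by enumerating Gelfand-Zetlin patterns directly from the interlacing condition (21). The following is a small sketch; the function names `interlaced` and `gz_patterns` and the tuple encoding of patterns are assumptions made for this example. The number of patterns equals the dimension of $V_\lambda$, which for $\lambda = (n, 0, \ldots, 0)$ is $\dim(n, d)$.

```python
from itertools import product
from math import comb


def interlaced(lam):
    """All weights mu of length len(lam)-1 with lam[i+1] <= mu[i] <= lam[i]."""
    ranges = [range(lam[i + 1], lam[i] + 1) for i in range(len(lam) - 1)]
    return product(*ranges)


def gz_patterns(lam):
    """Yield Gelfand-Zetlin patterns (lam^(d-1), ..., lam^(1)) for highest weight lam."""
    if len(lam) == 1:
        yield ()
        return
    for mu in interlaced(lam):
        for rest in gz_patterns(mu):
            yield (mu,) + rest


if __name__ == "__main__":
    print(len(list(gz_patterns((3, 0, 0)))), comb(3 + 3 - 1, 3))
    print(list(gz_patterns((1, -1))))
```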

We denote by $dg$ the volume element of the Haar measure on $U(d)$ with
normalisation $\int dg = 1$. We now consider the Hilbert space $L^2(U(d))$ of square integrable
functions on $U(d)$ with the inner product

$$(\alpha, \beta) = \int \overline{\alpha(g)}\, \beta(g)\, dg,$$

for two functions $\alpha(g)$ and $\beta(g)$. $L^2(U(d))$ carries a representation of $U(d) \times U(d)$
when equipped with the action

$$U(d) \times U(d) \ni (g_1, g_2) : \alpha(g) \mapsto \alpha(g_1^{-1} g g_2).$$

Let

$$t_{\lambda,m,m'}(g) := d_\lambda \langle \lambda, m | g | \lambda, m' \rangle$$

be the characteristic (or representative) functions, i.e. the matrix elements of
the irreducible representations of $U(d)$ (multiplied by $d_\lambda := \dim V_\lambda$). Note
that these functions are orthonormal with respect to the above defined inner
product and, for fixed $\lambda$, span an irreducible representation of $U(d) \times U(d)$
with a pair of highest weights $(\lambda^*, \lambda)$, where $\lambda^*$ denotes the highest weight
of $V_\lambda^*$, the representation dual to $V_\lambda$. It is not difficult to check that $\lambda^* =
(-\lambda_d, \ldots, -\lambda_1)$. The Peter-Weyl theorem asserts that the span of these functions is dense
in $L^2(U(d))$. Note that one can interpret this theorem as a Fourier theorem
on $U(d)$ as it shows that any square integrable function can be expressed as a
linear combination of the basis functions $t_{\lambda,m,m'}$. In the following we want to
derive a similar statement for functions on $\mathbb{CP}^{d-1}$.

When a group $G$ acts transitively on a set $X$ one can identify $X$ with the set
$G/H$ of left-cosets of the stabiliser group $H$ of a point $x_0 \in X$, i.e., the group
$H := \{g \in G : g x_0 = x_0\}$, and the isomorphism $G/H \cong X$ is $gH \mapsto g x_0$ (see
[51 p. 59]). In the following we consider the transitive action of $U(d)$ on $\mathbb{CP}^{d-1}$
and let $x_0$ be the point with homogeneous coordinates $[0 : \cdots : 0 : 1]$. Then
H = U{d-l)x U{1) and CP'^-^ ^ U{d)/[U{d-l) x C/(l)], gTl p. 278]. 

We will show below that the vectors $|\lambda, 0\rangle$ for

$$\lambda = (\ell, 0, \ldots, 0, -\ell) \qquad (22)$$

are exactly the ones being stabilized by $U(d-1) \times U(1)$. For such $\lambda$, we can
therefore define the functions $y_{\ell,m}$ on $\mathbb{CP}^{d-1}$ by

$$y_{\ell,m}(x) := t_{\lambda,m,0}(g) \qquad (23)$$

for $g \in x$. Since the measure $dg$ on $U(d)$ descends to a measure $dx$ on $\mathbb{CP}^{d-1}$,
these functions are also square integrable and orthonormal with respect to the
standard inner product. The following theorem, the main statement of this
section, asserts that these functions span $L^2(\mathbb{CP}^{d-1})$ densely.

Theorem 5. Let $\mu \in L^2(\mathbb{CP}^{d-1})$. Then

$$\mu(x) = \sum_{\ell \in \mathbb{N}} \sum_m \hat\mu(\ell, m)\, y_{\ell,m}(x)$$

where the second sum ranges over Gelfand-Zetlin patterns $m$ associated to the
irreducible representation $(\ell, 0, \ldots, 0, -\ell)$ of $U(d)$. The constants $\hat\mu(\ell, m)$ are
square summable.

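The proof is based on the following extension of the Peter-Weyl theorem.
Define

$$V_\lambda^H := \{v \in V_\lambda : h|v\rangle = |v\rangle \ \forall h \in H\}$$

as the $H$-invariant subspace of $V_\lambda$.

Theorem 6 (Peter-Weyl theorem).

$$L^2(G/H) \cong \widehat{\bigoplus_\lambda}\, V_\lambda^* \otimes V_\lambda^H$$

where $\widehat{\bigoplus}$ is the completion of the direct sum. A basis for $V_\lambda^* \otimes V_\lambda^H$ is given by
$t_{\lambda,m,m'}$.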
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Proof. See e.g. Corollary 9.14]. □ 

The following lemma characterises the components in the direct sum in terms 
of the functions yi^m- 

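Lemma 4. We have

$$V_\lambda^* \otimes V_\lambda^{U(d-1) \times U(1)} \cong \begin{cases} \mathrm{span}\{y_{\ell,m}\} & \lambda = (\ell, 0, 0, \ldots, 0, -\ell) \\ 0 & \text{otherwise.} \end{cases}$$

Proof. Since $t_{\lambda,m,m'}(g) = d_\lambda \langle \lambda, m | g | \lambda, m' \rangle$, it suffices to show that the vectors
$|\lambda, 0\rangle$ with $\lambda$ as in the statement are exactly the vectors fixed by $U(d-1) \times U(1)$.
The claim is therefore equivalent to

$$V_\lambda^{U(d-1) \times U(1)} = \begin{cases} \mathrm{span}\{|\lambda, 0\rangle\} & \lambda = (\ell, 0, 0, \ldots, 0, -\ell) \\ 0 & \text{otherwise.} \end{cases}$$

It follows from the branching rule and simple counting of degrees of the
polynomials that

$$V_\lambda \downarrow^{U(d)}_{U(d-1) \times U(1)} \cong \bigoplus_\mu V_\mu \otimes V_{|\lambda| - |\mu|}$$

where the sum extends over $\mu$ that are interlaced by $\lambda$ and $V_{|\lambda| - |\mu|}$ is the
one-dimensional representation of $U(1)$ with weight $|\lambda| - |\mu|$. Since $V_\lambda^H = (V_\lambda \downarrow^G_H)^H$
we find

$$V_\lambda^{U(d-1) \times U(1)} = \left( V_\lambda \downarrow^{U(d)}_{U(d-1) \times U(1)} \right)^{U(d-1) \times U(1)} \cong \bigoplus_\mu V_\mu^{U(d-1)} \otimes V_{|\lambda| - |\mu|}^{U(1)}.$$

$V_\mu^{U(d-1)}$ is nonzero exactly when $V_\mu$ is the trivial representation, i.e. $\mu = (0^{d-1})$.
Likewise, $V_{|\lambda| - |\mu|}^{U(1)}$ is nonvanishing only when $V_{|\lambda| - |\mu|}$ is the trivial
representation of $U(1)$, i.e. $|\lambda| - |\mu| = 0$. Note that $(0^{d-1})$ is interlaced by $\lambda$ only when
$\lambda = (\lambda_1, 0, \ldots, 0, \lambda_d)$ and that $|\lambda| = 0$ furthermore implies $\lambda_1 + \lambda_d = 0$.
Setting $\lambda_1 = \ell$ completes the proof. □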

Proof of Theorem 5. We apply Theorem 6 to $G = U(d)$ and $H = U(d-1) \times
U(1)$. Recalling that in this case $G/H \cong \mathbb{CP}^{d-1}$, the left hand side becomes
$L^2(\mathbb{CP}^{d-1})$. According to Lemma 4 the right hand side equals the space
spanned by the functions $y_{\ell,m}$. This concludes the proof. □

We now want to relate the multiplication of the t functions to the Clebsch- 
Gordan coefficients of U{d). In general we have the decomposition 
where $U(d)$ is embedded diagonally into $U(d) \times U(d)$, i.e. $U(d) \ni g \mapsto g \times g \in
U(d) \times U(d)$. The multiplicities $c^{\lambda''}_{\lambda\lambda'}$ are the well-known Littlewood-Richardson
coefficients (see e.g. [H]). In terms of a basis transform this isomorphism reads

$$|\lambda, m\rangle |\lambda', m'\rangle = \sum_{\lambda'', m'', r} \langle \lambda, \lambda', r, \lambda'', m'' | \lambda, m\rangle |\lambda', m'\rangle \; |\lambda, \lambda', r, \lambda'', m''\rangle$$

with the $U(d)$ Clebsch-Gordan coefficients $\langle \lambda, \lambda', r, \lambda'', m'' | \lambda, m\rangle |\lambda', m'\rangle$, where
$r$ counts the different copies of $V_{\lambda''}$. The following lemma relates the product of
two functions $t_{\lambda,m,0}$ and $t_{\lambda',m',0}$ to the $U(d)$ Clebsch-Gordan coefficients. More
generally, such a formula can be derived for the product of arbitrary $t$
functions (see [42, Chapter 18.2.1]).



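Lemma 5.

$$t_{\lambda,m,0}(g)\, t_{\lambda',m',0}(g) = \sum_{\lambda'', m''} \begin{bmatrix} \lambda & \lambda' & \lambda'' \\ m & m' & m'' \end{bmatrix} t_{\lambda'',m'',0}(g)$$

where

$$\begin{bmatrix} \lambda & \lambda' & \lambda'' \\ m & m' & m'' \end{bmatrix} = \int t_{\lambda,m,0}(g)\, t_{\lambda',m',0}(g)\, \overline{t_{\lambda'',m'',0}(g)}\, dg$$

$$= \frac{d_\lambda d_{\lambda'}}{d_{\lambda''}} \sum_r \langle \lambda, \lambda', r, \lambda'', 0 | \lambda, 0\rangle |\lambda', 0\rangle \; \langle \lambda, m| \langle \lambda', m' | \lambda, \lambda', r, \lambda'', m'' \rangle. \qquad (24)$$

Proof.

$$t_{\lambda,m,0}(g)\, t_{\lambda',m',0}(g) = d_\lambda d_{\lambda'} \langle \lambda, m | g | \lambda, 0\rangle \langle \lambda', m' | g | \lambda', 0\rangle$$

$$= d_\lambda d_{\lambda'} \left( \langle \lambda, m| \langle \lambda', m'| \right) g \otimes g \left( |\lambda, 0\rangle |\lambda', 0\rangle \right)$$

$$= d_\lambda d_{\lambda'} \sum_{\lambda'', m'', r} \langle \lambda, \lambda', r, \lambda'', 0 | \lambda, 0\rangle |\lambda', 0\rangle \times \langle \lambda, m| \langle \lambda', m' | \lambda, \lambda', r, \lambda'', m'' \rangle \, \langle \lambda'', m'' | g | \lambda'', 0 \rangle$$

$$= \sum_{\lambda'', m''} \begin{bmatrix} \lambda & \lambda' & \lambda'' \\ m & m' & m'' \end{bmatrix} t_{\lambda'',m'',0}(g).$$

□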

This leads to an important product formula which we use, for instance, in 
the update rule. 

Corollary 1. 



where 



TO TO TO 



ilarly for \' and X" (see (2A)). 



i" m" 

A A' A" 



TO TO TO 



TO TO TO 



/or A = (i?, 0, . . . , 0, — £) HTirf sim- 
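Corollary 2.

$$\dim(n, d)\, |\langle d|x\rangle|^{2n} = \frac{1}{\dim(n, d)} \sum_\ell \begin{bmatrix} \nu & \nu^* & \lambda \\ 0 & 0 & 0 \end{bmatrix} y_{\ell,0}(x) \qquad (25)$$

where

$$\begin{bmatrix} \nu & \nu^* & \lambda \\ 0 & 0 & 0 \end{bmatrix} = \frac{\dim(n, d)^2}{d_\lambda}\, |\langle \nu, \nu^*, \lambda, 0 | \nu, 0\rangle |\nu^*, 0\rangle|^2 \qquad (26)$$

for $\lambda = (\ell, 0, \ldots, 0, -\ell)$ with $\ell \le n$. Furthermore,
$\begin{bmatrix} \nu & \nu^* & \lambda \\ 0 & 0 & 0 \end{bmatrix} \ne 0$ for $\ell \le n$
and vanishes for $\ell > n$.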

Proof. Note that dim(n, d) = du — d^- . Then 

^dl{v,Q\g\v,(S){v,Q\g\v,Q) 



(25) 



(26) 



7^ /or ^ < n 



= tufifl{x)tvfifi{x) = t,y,o,o(a;)i,y',o,o(2:) 

since $|d\rangle^{\otimes n}$ is a weight vector in $V_\nu$, $\nu = (n, 0, \ldots, 0)$, that is invariant with respect to
the subgroup $U(d-1)$ and therefore has a Gelfand-Zetlin pattern $m = 0$. The
invariance with respect to $U(d-1)$ follows from the tensor product action of the
group $U(d)$ as well as the fact that the stabilizer of $|d\rangle$ contains $U(d-1)$. (26)
holds since the Littlewood-Richardson coefficient $c^{\lambda}_{\nu\nu^*}$ equals one for $\ell \le n$ and
vanishes for larger values of $\ell$ [14]. This implies in particular that
$\begin{bmatrix} \nu & \nu^* & \lambda \\ 0 & 0 & 0 \end{bmatrix}$
vanishes for $\ell > n$. In order to see that
$\begin{bmatrix} \nu & \nu^* & \lambda \\ 0 & 0 & 0 \end{bmatrix}$
does not vanish for
smaller values of $\ell$, note that the projection of $|\nu, 0\rangle |\nu^*, 0\rangle$ onto the irreducible
representation $\lambda = (\ell, 0, \ldots, 0, -\ell)$ is given by

$$P_\lambda |\nu, 0\rangle |\nu^*, 0\rangle = \sum_m \left( \langle \nu, \nu^*, \lambda, m | \nu, 0\rangle |\nu^*, 0\rangle \right) |\nu, \nu^*, \lambda, m\rangle$$

where we used the fact that the sum can only contain $H$-invariant vectors.
The claim follows since we know that this projection cannot vanish, as the
Littlewood-Richardson coefficient is nonzero for all $\ell \le n$. □

Lemma 6. For g E zH, where H = U{d — 1) x U{1), we have 



^embedded into U(d) by inclusion into the top left corner 
Proof.



yifiig^ x)yi^m{x)dx = dxdy J {X,0\g^ g\\,0){\' ,0\g^\X' ,m)dg 



□ 



The next corollary generalises the decomposition in Corollary 2. It is needed
in some technical aspects of the paper as well as in the examples.



Corollary 3. 

dim(n,d)|(z|a;)|2" = 



dim(n, d) 



E 



V V* \ 





where v = (n, 0, . . . , 0) and A = 0, . . . , 0, — i?). The coefficients are defined 
in pil). 



Proof. By Corollary [2] and ( 24 ) we have 



1 " 



V V* \ 



V V* \ 





vi,o{g'^x) 



(27) 



where g € zH (\x) = g\d))aiY(i where we used Lemma[6]in the last equation. □ 



E Recovering the Spherical Harmonics on the 
Bloch Sphere 

In the following we restrict our attention to the special case $d = 2$. The complex
projective space $\mathbb{CP}^1$ can be viewed as the sphere $S^2$ with $x \in \mathbb{CP}^1$ being
represented as a point on the sphere parametrised by angles $\theta \in [0, \pi]$ and
$\phi \in [0, 2\pi)$. The measure $dx$ turns into $\frac{1}{4\pi} \sin\theta\, d\theta\, d\phi$. As a unitary representative
for $x$, $g \in xH$, we choose

This implies \x) — g\2) = ie^i sin + e~'^ cos ||2). where we used 




_(e't \/cosf isin|\ , , 
We may think of $\theta = 0$ as the south pole and $\theta = \pi$ as the north pole of the
sphere (when the $z$ direction is the rotation axis of the earth).

It is then natural to expect that the $y_{\ell,m}$ are related to the ordinary spherical
harmonics on the sphere. The next lemma provides us with the precise dependence.
Thereafter we will find a formula for the coefficients
$\begin{bmatrix} \lambda & \lambda' & \lambda'' \\ m & m' & m'' \end{bmatrix}$
that govern the multiplication of two functions.

Before we start, note that for $d = 2$ the possible Gelfand-Zetlin patterns $m$
for $\lambda = (\ell, -\ell)$ lie in the interval $-\ell \le m \le \ell$ and that the spin projection in
$z$-direction of the state $|\lambda, m\rangle$ equals $m$ (cf. Lemma 8).

Lemma 7. For d — 2 

where P™ are the associated Legendre polynomials and 



vr«',*):=v'^^w'|^pr(-«)e- 



(^ + m)! 

are the spherical harmonics^ 

Proof. By [40l eq (1) in Chapter 6.3.1] 

(^,m|g|£,0) =t_„,o(5). 



Using (28) and gOl eq (2) & (4) in Chapter 6.3.3 and eq (3) in Chapter 6.3.7.; 



we see that for positive m 



e™n-.)"^|^pr(cos.) 

holds, where P™ denote the associated Legendre polynomials. The same formula 
can be seen to hold for negative m by use of [40j eq (2') in Chapter 6.3.6 and 
eq (3') in Chapter 6.3.7]. 

□ 



We will now find formulae for the 



by relating them to the 



^ ^ ^ 

Clebsch-Gordan coefficients of SU (2) for which closed formulae are known. We 
start by relating the Clebsch-Gordan coefhcients of SU{2) and U{2). 

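Lemma 8. Let $d = 2$. If $c^{\lambda''}_{\lambda\lambda'} \ne 0$ then

$$\langle \lambda, \lambda', \lambda'', m'' | \lambda, m\rangle |\lambda', m'\rangle_{U(2)} = \langle L, L', L'', M'' | L, M\rangle |L', M'\rangle_{SU(2)}$$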



^i.e. the eigenvalue of the (Lie algebra) representation of the operator |(Tz 
*Note that we are using a standard convention also used in Mathematica. 
where

$$\lambda = (\lambda_1, \lambda_2), \qquad L = \frac{\lambda_1 - \lambda_2}{2}, \qquad M = m - \frac{\lambda_1 + \lambda_2}{2}$$

and likewise for the primed variables.

Proof. If $c^{\lambda''}_{\lambda\lambda'} \ne 0$, i.e. $V_{\lambda''} \subseteq V_\lambda \otimes V_{\lambda'}$, then

$$V_{\lambda''} \downarrow^{U(2)}_{SU(2)} \subseteq V_\lambda \downarrow^{U(2)}_{SU(2)} \otimes V_{\lambda'} \downarrow^{U(2)}_{SU(2)}.$$

Since $V_\lambda \downarrow^{U(2)}_{SU(2)}$ is equivalent to a spin-$L$ representation (with $L$ as in the
claim) we can obtain the Clebsch-Gordan coefficients for $U(2)$ from those of
$SU(2)$. This works as follows. The mapping of the basis state in the irreducible
representation $\lambda = (\lambda_1, \lambda_2)$ with Gelfand-Zetlin pattern $(m)$ is

$$|\lambda, m\rangle_{U(2)} \to |L, M\rangle_{SU(2)},$$

where $L$ and $M$ are defined as in the statement of the claim since the weight of a
Gelfand-Zetlin pattern $m$ in a representation $\lambda$ equals $(w_1, w_2) = (m, \lambda_1 + \lambda_2 - m)$
and the spin projection $M$ along the $z$-direction equals $\frac{w_1 - w_2}{2}$. This concludes
the proof. □
Lemma 9. Let d — 2 and A — {£, —£), v — {n, 0) and i < n. If i is even, then 

A,0|(.,0)|.*, 0)^(2) = /|^ ^r^==^ (29) 
Vn + 1+1 v (n - iy.(n + ly. 

and zero otherwise. If n and I are even 



{v,v\\,Q\v, -)\v\~-)u(2) = 



and zero otherwise. 
Proof. By Lemma |8] 

(.,.*,A,0|z.,0)|.*,0) = (^,^,^,0|^,-|)|^,|). 

Using the formula 40, eq (4) in Chapter 8.2.4] 

(£,f,r,£-£'|^,^)|f,-/)s[/(2) = 



(2£" + l)(2^)!(2f)! 

{I + e - + 1' + 1" + 1)\ 



we find 



V2iTl 

> "1 77 ' 



'2' 2' ' '2' 2"2' 2' Vn + £+l ^{n-iy.{n + iy. 
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This concludes the proof of (29). By Lemma [s] 

71 Tl Tl Th Tl Tl 

(^., A, 01^., -) 1^.* ,--) ^(2) = (-,-,£, 0| -, 0) I -, 0)s[/(2) . 
Using the formula 40, eq (8) in Chapter 8.2.6] 

where 2g -.^ 1 + 1' + 1" is even (the coefficient vanishes for odd 2g) and (see 
eq (3) in Chapter 8.1.3] 



(30) 



A(^,/,/' 



• + - -I' + e")\{i' - £ + (")\ 



y (^ + £' + £" + !)! 

li t — C the coefficient vanishes unless I" is even. If n and I are even we find 



,n n „ ,n , , n 



Otherwise the coefficient vanishes. 

Corollary 4. For A = (^, ^t) and v — {n, 0) and £ and n even we have 



□ 



1 



V V 

n n 

2 2 



A 




4^ 



J/ I/* A 




n!(n + 1)! 



e-i 

n + 2 + i 



{n - e)l{n + £ + ly. 



Proof. By Lemma |9] we have 
(-1) 



(^)!(i)!2(n + l+£)! 

^ (f-i + i)---(f + 

(n + 2)---(n + £+l)(|)!2 
_ 1 e {n-e + 2){n-e + A)---{n + £)£! 
~^2' (n + 2)---(n + ^ + l)|!2 

= (^/ A(i _ ^)(1 - . . . (1 - 



1 

n + 2 



which proves the first formula. The second formula follows from Corollaries |8] 
andO The estimate derives from 



$$\frac{(n+1)!\, n!}{(n-\ell)!\, (n+1+\ell)!} = \frac{(n-\ell+1) \cdots n}{(n+2) \cdots (n+1+\ell)} \ge \left( \frac{n-\ell+1}{n+2} \right)^{\ell} \ge 1 - \frac{\ell(\ell+1)}{n+2}.$$

□
Finally, we compute the Fourier coefficients of the distribution that is uniform
on the equator of the Bloch sphere.

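Lemma 10. Let $d = 2$ and let $\mu(x)$ be the distribution that is uniformly
concentrated on the equator of the Bloch sphere, i.e.

$$\int \mu(x) f(x)\, dx = \frac{1}{2\pi} \int_{[0, 2\pi)} f(x_\phi)\, d\phi$$

for all test functions $f(x)$, where

$$x_\phi = \frac{(e^{i\phi/2}|1\rangle + e^{-i\phi/2}|2\rangle)(e^{-i\phi/2}\langle 1| + e^{i\phi/2}\langle 2|)}{2}$$

are the points on the equator. Then

$$\mu(x) = \sum_{\ell \text{ even}} (-1)^{\ell/2}\, \frac{1}{2^\ell} \binom{\ell}{\ell/2}\, y_{\ell,0}(x).$$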

Proof. Note that $\mu(x) = \mu(hx)$ for all $h \in H = U(1) \times U(1)$, i.e. $\mu$ is $H$-invariant.
The nonvanishing Fourier components must therefore also be $H$-invariant, which
implies that only the ones where $m = 0$ can be nonzero.
The remaining coefficients are



/ 



Kx)y\fl{x)dx = ^ I d(j)yx,o{x^) 

^ J #(A,0|5^|A,0) 



27r 

= (A,0|( 4 V^^o) 
where we chose a representative from the coset x^H: 

/ e''t>/^ \ 

and used (A,0| = (A, 0| I ^ g-i<t>/2 1 the last line. The calculation of 
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the last term is a combinatorial feast: 



(A,0| ( h ) |A,0) = (A,0| ( f A J l^'O) 



V2 \/2 



z \ \/2 \/2 / 

= (;^)''E(-l|i + 2)^'|i-2r 

= l^(z||ll-22r 

Z 

= ^(-i)-(l 

if $\ell$ is even. Otherwise the last formula vanishes. The sums over $z$ and $z'$ extend
over all binary strings (with symbols 1 and 2) of length $2\ell$ with Hamming weight
$\ell$. □



F Relation to Maximum Likelihood Estimation 

In order to study the relation of our method to the well-known MLE, we consider
the practically relevant case, where $B^n$ is of product form, i.e.

$$B^n = \bigotimes_{i=1}^m E_i^{\otimes f^{(i)}}$$

where $E_i$ are the elements of a POVM, i.e. $E_i \ge 0$ and $\sum_i E_i = 1$. $f =
(f^{(1)}, \ldots, f^{(m)})$ is the vector containing the frequencies with which the outcomes
occur, i.e. $f^{(i)} \in \mathbb{N}$ and $\sum_{i=1}^m f^{(i)} = n$. We find

$$\mu_{B^n}(\sigma) = \frac{1}{c_{B^n}} \prod_i (\mathrm{tr} E_i \sigma)^{f^{(i)}}$$

where $c_{B^n} = \int \mathrm{tr}[B^n \sigma^{\otimes n}]\, d\sigma$. We now want to investigate the maximum of this
function, which we assume for simplicity to be unique. Since the log function is
strictly increasing (and concave), the $\sigma$ which maximises this function can also be expressed as

$$\sigma_{\max} := \mathrm{argmax}_\sigma \Big( \sum_i \bar f^{(i)} \log \mathrm{tr} E_i \sigma \Big), \qquad (31)$$

where $\bar f^{(i)} = f^{(i)}/n$ are the relative frequencies. This shows that the value at which
$\mu_{B^n}$ is maximised coincides with the density matrix that the MLE method infers
since $\sum_i \bar f^{(i)} \log \mathrm{tr} E_i \sigma$ is the so-called "log likelihood measure" [53, Eqs. (2) and
(3)].

The MLE method is therefore consistent with our method and our work can 
be seen as its theoretical justification. We emphasize that in contrast to MLE, 
our work shows how to compute reliable error bars. This implies in particular 
that the likely state (i.e. the state within error bars) is not on the boundary 
of the state space, even though the maximum might lie on the boundary. Our 
method can therefore be seen as a resolution of the "problem" that the states 
predicted by MLE are unphysical because they lie on the boundary. 
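To make the maximisation in (31) concrete, the following sketch computes the maximiser for a single qubit. It is an illustration under several assumptions that are not taken from the text above: the POVM is chosen to be the six Pauli eigenprojectors weighted by 1/3, and the maximum is searched with the iterative "$R\rho R$" fixed-point scheme that is commonly used for maximum likelihood tomography. All names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULIS = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
# Assumed six-outcome POVM: each Pauli eigenprojector (I +/- P)/2 weighted by 1/3.
POVM = [(I2 + s * p) / 6 for p in PAULIS for s in (+1, -1)]


def outcome_probabilities(sigma: np.ndarray) -> np.ndarray:
    """The vector (tr E_i sigma)_i."""
    return np.array([np.real(np.trace(e @ sigma)) for e in POVM])


def log_likelihood(sigma: np.ndarray, rel_freq: np.ndarray) -> float:
    """sum_i fbar_i log tr(E_i sigma), the exponent appearing in (31)."""
    return float(np.sum(rel_freq * np.log(outcome_probabilities(sigma))))


def mle_state(rel_freq: np.ndarray, iterations: int = 500) -> np.ndarray:
    """Iterate sigma -> R sigma R / tr(R sigma R) with R = sum_i (fbar_i / p_i) E_i."""
    sigma = I2 / 2
    for _ in range(iterations):
        probs = outcome_probabilities(sigma)
        r = sum(f / p * e for f, p, e in zip(rel_freq, probs, POVM))
        sigma = r @ sigma @ r
        sigma = sigma / np.real(np.trace(sigma))
    return sigma


if __name__ == "__main__":
    true_state = (I2 + 0.3 * PAULIS[0] + 0.2 * PAULIS[1] + 0.4 * PAULIS[2]) / 2
    freqs = outcome_probabilities(true_state)   # idealised, noise-free frequencies
    print(np.round(mle_state(freqs), 6))
```

With noise-free relative frequencies generated from an interior state, the iteration returns that state, in line with the condition $\bar f^{(i)} = \mathrm{tr} E_i \sigma$ for a tomographically complete POVM.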



G Exponential Decay around the Maximum 

We now want to study the decay of the estimate density around its maximum. 
In general this is not easy, as the maximum might lie on the boundary of the set. 
Useful statements can be made however, when the maximum is in the interior. 
We use the same conventions as in Appendix F and consider the exponent

$$\sum_i \bar f^{(i)} \ln \mathrm{tr} E_i \sigma \qquad (32)$$

and search for the extreme points of the function

$$\sum_i \bar f^{(i)} \ln \mathrm{tr} E_i \sigma + c\,(\mathrm{tr}\sigma - 1)$$

where we introduced the Lagrange multiplier $c$ in order to take care of the
normalisation of the density matrix. The extreme points are characterised by
the equations

$$\sum_i \frac{\bar f^{(i)}}{\mathrm{tr} E_i \sigma}\, E_i + c\, 1 = 0,$$

$$\mathrm{tr}\sigma = 1.$$

Let us for simplicity restrict to the case where the $E_i$ are linearly independent;
then, since the $E_i$ form a POVM, we have

$$\bar f^{(i)} = \mathrm{tr} E_i \sigma \qquad (33)$$

as a condition for an extreme point. Note that the decay exponent of the
normalisation $c_{B^n}$ is (for large $n$) equal to the maximum of (32) (which equals
$\sum_i \bar f^{(i)} \log \bar f^{(i)}$). The exponent of the estimate density $\ln \mu_{B^n}$ is therefore
asymptotically equal to the negative of the relative entropy

$$D(\bar f \| E(\sigma)) = \sum_i \bar f^{(i)} (\ln \bar f^{(i)} - \ln \mathrm{tr} E_i \sigma)$$

where $E(\sigma) = (\mathrm{tr} E_1 \sigma, \ldots, \mathrm{tr} E_m \sigma)$ (this shows in particular that we found a
maximum, since the relative entropy is nonnegative). By Pinsker's inequality

$$D(\bar f \| E(\sigma)) \ge \tfrac{1}{2} \| \bar f - E(\sigma) \|_1^2.$$

This implies that the error bars around the maxima are given by

$$\frac{1}{\sqrt{n}}$$

in the norm on the set of density matrices induced by the norm

$$\|X\| := \|E(X)\|_1 = \sum_i |\mathrm{tr} E_i X|.$$

Note that if in addition the POVM is tomographically complete (i.e. its elements
span the real vector space of Hermitian operators), then the maximum
is unique and can be determined by solving the condition (33).
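The relative entropy and Pinsker's inequality used in this argument can be checked on small probability vectors. This is a minimal sketch with natural logarithms, matching the use of $\ln$ above; the function names and the example vectors are illustrative.

```python
import numpy as np


def relative_entropy(p, q) -> float:
    """D(p || q) in nats; terms with p_i = 0 contribute zero."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * (np.log(p[mask]) - np.log(q[mask]))))


def pinsker_gap(p, q) -> float:
    """D(p || q) - (1/2) ||p - q||_1^2, nonnegative by Pinsker's inequality."""
    diff = np.asarray(p, dtype=float) - np.asarray(q, dtype=float)
    l1 = float(np.sum(np.abs(diff)))
    return relative_entropy(p, q) - 0.5 * l1 ** 2


if __name__ == "__main__":
    rel_freq = [0.5, 0.3, 0.2]               # plays the role of the relative frequencies
    for probs in ([0.4, 0.4, 0.2], [0.1, 0.1, 0.8]):
        print(relative_entropy(rel_freq, probs), pinsker_gap(rel_freq, probs))
```

Since the estimate density decays like $e^{-n D}$ and $D \ge \frac{1}{2}\|\cdot\|_1^2$, deviations of order $1/\sqrt{n}$ in the induced norm are the natural scale of the error bars.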